Running head: Experimental set-up and data analysis
Keywords: DNA-SIP, RNA-SIP, amplicon sequencing, omics, network analysis
Introduction
The success of any laboratory experiment hinges on a thoughtful design of the experimental system, careful execution of protocols and statistically sound data analysis. While SIP protocols have matured and become standardised in the 20 years since their introduction, what surrounds gradient generation and fractionation, i.e., the experimental design and data analysis, has been somewhat neglected. Other chapters in this book provide detailed protocols on how to perform SIP in the lab and how to analyse the data using specific methods. This chapter instead discusses general considerations in conceptualising a SIP experiment, designing the experimental set-up and choosing the right analysis method. The focus is on DNA- and RNA-SIP experiments, since these are the most flexible and most widely used forms of SIP. Table \ref{tab:design_considerations} summarises the main points to consider at each step in designing a SIP experiment.
Choice of stable isotope
Every SIP experiment is based on incubating the sample in the presence of a substrate labelled with a heavy isotope. In theory, every element present in the target biomolecule -- DNA, RNA, phospholipid-derived fatty acids, or proteins -- can be labelled and therefore used in a SIP experiment. The only exception is phosphorus, whose common form, 31P, is its only stable isotope. In practice, however, SIP experiments almost exclusively use 13C as the isotope of choice, with a small minority using 18O or 15N. The choice of substrate and stable isotope is directly related to the metabolic process or microbial guild of interest. In SIP, target microbes can only be isotopically labelled through assimilatory processes. This is somewhat unfortunate because many of the microbially mediated biogeochemical processes of interest are energy-yielding dissimilatory processes, which involve only electron transfer between two compounds and leave no trace in the biomass. In such cases, the microbial guild of interest can only be labelled indirectly, through an assimilatory process that is powered by the dissimilatory process of interest (e.g., using 18O-H2O as a general substrate for all active organisms, or 13C-CO2 for autotrophs).
Beyond the question of which biological process or microbial target group to study, the stable isotopes used for SIP differ in how strongly they label nucleic acids and therefore in the change in buoyant density (BD) they produce. Table \ref{tab:added_neutrons} lists the number of additional neutrons gained per nucleotide in a DNA or RNA molecule when all the atomic positions of a particular element are replaced with its heavier stable isotope. The table shows that, theoretically, the highest mass increase is achieved with 18O, which adds on average 12 or 14 neutrons per nucleotide for a hypothetical DNA or RNA molecule, respectively. This is because labelling with 18O adds two neutrons per atom, compared to only one for 13C, 15N or D, leading to a higher overall mass increase despite the lower number of O atoms in the molecule. In contrast, N is the rarest of these elements in nucleic acids, and labelling with 15N adds at most 3.75 neutrons per base on average, a mass increase roughly 2.5 times smaller than that obtained with 13C. This was confirmed experimentally over 40 years ago, when it was shown that fully 15N-labelled DNA in CsCl has a BD gain of ca.~0.016~g~ml$^{-1}$ compared to ca.~0.036~g~ml$^{-1}$ with 13C \cite{birnie_isopycnic_1978}. Similarly, RNA fully labelled with 15N showed a BD gain of ca.~0.015~g~ml$^{-1}$ \cite{Angel_2017} compared to 0.035~g~ml$^{-1}$ for 13C \cite{Lueders2004}. The lower maximum mass addition through 15N-labelling means that labelled nucleic acids shift less far from unlabelled nucleic acids in an isopycnic gradient than with 13C-labelling. This more modest shift in BD is nevertheless sufficient to detect labelling in DNA originating from a single organism, as was shown in the classical work of Meselson and Stahl \cite{Meselson_1958}.
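As a rough cross-check of these averages, the added neutrons per nucleotide can be recomputed by counting atoms. The sketch below assumes an equimolar base composition and treats each nucleotide residue as the nucleoside monophosphate minus one water molecule. The symbols $\bar{n}$ (mean number of atoms per residue) and $\Delta$ (mean number of added neutrons) are introduced here for illustration only, and Table \ref{tab:added_neutrons} remains the reference. Atom counts are listed in the order A, T (or U), G, C.

\[
\begin{array}{lll}
\mbox{$^{15}$N (DNA and RNA):} & \bar{n}_{\mathrm{N}} = (5 + 2 + 5 + 3)/4 = 3.75, & \Delta = 1 \times 3.75 = 3.75 \\
\mbox{$^{18}$O (DNA):} & \bar{n}_{\mathrm{O}} = (5 + 7 + 6 + 6)/4 = 6, & \Delta = 2 \times 6 = 12 \\
\mbox{$^{18}$O (RNA):} & \bar{n}_{\mathrm{O}} = (6 + 8 + 7 + 7)/4 = 7, & \Delta = 2 \times 7 = 14 \\
\mbox{$^{13}$C (DNA):} & \bar{n}_{\mathrm{C}} = (10 + 10 + 10 + 9)/4 = 9.75, & \Delta = 1 \times 9.75 = 9.75 \\
\mbox{$^{13}$C (RNA):} & \bar{n}_{\mathrm{C}} = (10 + 9 + 10 + 9)/4 = 9.5, & \Delta = 1 \times 9.5 = 9.5
\end{array}
\]

Under these assumptions, the ratio of the carbon to the nitrogen values (9.5 or 9.75 against 3.75) gives the roughly 2.5- to 2.6-fold difference between 13C- and 15N-labelling.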
For DNA-based SIP, however, the small BD shift obtained with 15N creates a major challenge, since double-stranded DNA migrates in a BD gradient not only as a function of its mass but also as a function of its hydration state. The latter is ultimately determined by the G+C content of the DNA and causes an undesired migration of unlabelled high-G+C DNA towards the denser regions of the gradient \cite{Rolfe_1959}. The first attempts to develop 15N-SIP already showed that, because 15N-labelled DNA migrates only a short distance, unlabelled DNA with a high G+C content can overlap with even fully labelled DNA of lower G+C content, obscuring the distinction between labelled and unlabelled taxa \cite{Cupples_2007,Youngblut_2014}. The problem is compounded by the fact that an A-T base pair contains only seven nitrogen atoms compared to eight in a G-C base pair, resulting in a slightly lower degree of labelling of the A-T base pair \cite{Cadisch_2005a}.
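To illustrate the scale of this overlap, the BD shift caused by G+C content can be compared with the shift caused by labelling. The sketch below assumes the commonly cited empirical linear relation between the BD of unlabelled double-stranded DNA in CsCl and its G+C fraction. This relation and the symbols $\rho$ (buoyant density) and $f_{\mathrm{GC}}$ (G+C fraction, between 0 and 1) are not taken from this chapter and should be treated as assumptions.

\[
\rho \approx 1.660 + 0.098\, f_{\mathrm{GC}} \quad (\mathrm{g\,ml^{-1}})
\]

The difference in G+C fraction that produces the same BD shift as full labelling is then the labelling gain divided by the slope:

\[
\Delta f_{\mathrm{GC}} = \frac{\Delta\rho_{\mathrm{label}}}{0.098}, \qquad
\frac{0.016}{0.098} \approx 0.16 \;\; (^{15}\mathrm{N}), \qquad
\frac{0.036}{0.098} \approx 0.37 \;\; (^{13}\mathrm{C})
\]

Under this assumption, an unlabelled genome whose G+C content is about 16 percentage points higher than that of a fully 15N-labelled genome would band at roughly the same density, whereas a difference of about 37 percentage points would be needed to mask full 13C-labelling.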
Surprisingly, while 18O-labelling should theoretically yield a mass gain that is 23% larger for DNA and 47% larger for RNA than labelling with 13C, in practice the observed BD shifts in 18O-SIP gradients are not much different from those in 13C-SIP gradients (ca.~0.04~g~ml$^{-1}$) \cite{Aanderud_2011,Angel_2013}, indicating that not all oxygen positions can be replaced with the heavy isotope.
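The theoretical percentages quoted here follow from the ratio of added neutrons per nucleotide. As a minimal sketch, assume an equimolar base composition and complete replacement of every O or C atom. This gives 12 (DNA) and 14 (RNA) added neutrons for 18O, and 9.75 (DNA) and 9.5 (RNA) for 13C, the latter obtained by averaging the carbon atoms of the four nucleotides:

\[
\mbox{DNA:} \;\; \frac{12}{9.75} \approx 1.23, \qquad
\mbox{RNA:} \;\; \frac{14}{9.5} \approx 1.47
\]

These ratios correspond to the 23% and 47% larger theoretical mass gains for 18O relative to 13C.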
Deuterium has been used in SIP experiments coupled with either Raman microspectroscopy \cite{Berry_2014} or metabolomics \cite{Baran_2017}, but because deuterated water (heavy water) is toxic at high concentrations, it is probably not suitable for DNA- or RNA-SIP.
Given these considerations, it is easy to understand why carbon is the most widely used element in SIP. Carbon is abundant enough in biomolecules to allow for easy labelling. In many cases, carbon-based substrates are used for both assimilatory and dissimilatory processes in the cell, so biomass labelling is easily achieved with any of a range of substrates. In contrast, many N-transforming processes are dissimilatory, while many N-assimilation processes are shared between different functional groups of microorganisms and therefore provide relatively little differentiating power. Similarly, oxygen is abundant in various terminal electron acceptors used for respiration, which are therefore unsuitable for SIP, or alternatively in water, which is assimilated into the biomass by all known organisms.