Galaxy formation and the structure of the Universe

Marcel van Daalen

Doctoral thesis (Proefschrift) to obtain the degree of Doctor at Leiden University (Universiteit Leiden), on the authority of the Rector Magnificus Prof. mr. C. J. J. M. Stolker, by decision of the Doctorate Board (College voor Promoties), to be defended on Tuesday 9 December 2014 at 16:15 by Marcel Pieter van Daalen, born in The Hague (’s-Gravenhage) in 1986.

Doctoral committee

Supervisors (promotores):
Prof. dr. J. Schaye
Prof. dr. S. D. M. White (MPA, Garching)

Other members:
Prof. dr. M. Franx
Prof. dr. K. H. Kuijken
Prof. dr. H. J. A. Röttgering
Dr. H. Hoekstra
Dr. A. R. Zentner (University of Pittsburgh, USA)

“Space is big. You just won’t believe how vastly, hugely, mind-bogglingly big it is. I mean, you may think it’s a long way down the road to the chemist’s, but that’s just peanuts to space.”
Douglas Adams, The Hitchhiker’s Guide to the Galaxy

The front of the cover shows an image of the Coma cluster over a silhouette of Leiden, while the back shows an image of the AGN−WMAP7−L100N512 simulation from the OWLS project over a silhouette of Munich. Coma cluster image by Jim Misti, all other cover images and design by the author.

Table of contents

1 Introduction
  1.1 Large-scale properties of the Universe
  1.2 Testing the standard cosmological model through clustering
    1.2.1 Linear structure formation
    1.2.2 Non-linear evolution
    1.2.3 The role of galaxy formation
  1.3 Numerical simulations
  1.4 This thesis

2 The Effects of Galaxy Formation on the Matter Power Spectrum: A Challenge for Precision Cosmology
  2.1 Introduction
  2.2 Simulations
    2.2.1 The reference simulation
    2.2.2 Other models
    2.2.3 Power spectrum calculation
  2.3 Results
    2.3.1 Comparison of a dark matter only simulation to models
    2.3.2 The relative effects of different baryonic processes
    2.3.3 Contributions of dark matter, gas and stars
    2.3.4 The back-reaction of baryons on the dark matter
    2.3.5 A closer look at the effects of AGN feedback
  2.4 Comparison with previous work
  2.5 Conclusions
  2.A Convergence tests
    2.A.1 Box size
    2.A.2 Numerical resolution
  2.B Tabulated power spectra

3 The Impact of Baryonic Processes on the Two-Point Correlation Functions of Galaxies, Subhaloes and Matter
  3.1 Introduction
  3.2 Method
    3.2.1 Simulations
    3.2.2 Calculating correlation functions
    3.2.3 Linking haloes between different simulations
  3.3 Results
    3.3.1 Clustering of galaxies
    3.3.2 Clustering of subhaloes
    3.3.3 Accounting for the change in mass
  3.4 Summary
  3.A Convergence tests
  3.B Linked fractions

4 The Effects of Halo Alignment and Shape on the Clustering of Galaxies
  4.1 Introduction
  4.2 Methods
    4.2.1 Simulation and SAM
    4.2.2 Calculation of the galaxy correlation function
    4.2.3 Testing the importance of alignment and ellipticity
    4.2.4 Testing the dependence of galaxy bias on halo shape
  4.3 Results
    4.3.1 Alignment and ellipticity
    4.3.2 Shape-dependent galaxy bias
  4.4 Summary

5 The Contributions of Matter Inside and Outside of Haloes to the Matter Power Spectrum
  5.1 Introduction
  5.2 Method
    5.2.1 Simulations
    5.2.2 Power spectrum calculation
    5.2.3 Halo particle selection
  5.3 Results
    5.3.1 Fractional mass in haloes
    5.3.2 Halo contribution to the power spectrum
  5.4 Summary & conclusions

6 The Galaxy Correlation Function as a Constraint on Galaxy Formation Physics
  6.1 Introduction
  6.2 Method
    6.2.1 Estimating the correlation function
    6.2.2 The SAM and MCMC
  6.3 Results
    6.3.1 Comparison with observations
    6.3.2 Change in parameters
  6.4 Summary

References

Dutch summary (Nederlandse samenvatting)
  The formation of galaxies and the structure of the Universe
  Structure formation
  Feedback
  Numerical simulations
  Clustering
  In this thesis

Publications

Curriculum vitae

Acknowledgements

1 Introduction

Our understanding of both the large-scale properties of our Universe and the processes through which galaxies form and evolve has greatly improved over the last few decades, thanks in part to new observational probes and more refined numerical simulations. While the precision with which we measure the cosmic background radiation, the distribution of matter and the properties of galaxies continues to increase, we are improving our simulations to include more physical processes and to resolve ever smaller scales. We are learning just how deeply cosmology and galaxy formation are intertwined, and the need to model them simultaneously in order to advance both fields is growing rapidly.
In this thesis, we investigate how galaxy formation can alter the structure of the Universe over a large range of scales, and how measuring the structure of the Universe can in turn help us to constrain models of galaxy formation.

1.1 Large-scale properties of the Universe

About 13.8 billion years ago, the Universe came into existence in an event we call the Big Bang. From that moment on, it has been continually expanding. As a consequence of the Universe being both isotropic and homogeneous on large scales, the rate of its expansion at any particular time can be related to the current one through a simple function of only four parameters.¹ These are the present-day matter content of the Universe, Ω_m,0; its radiation content, Ω_γ,0; its curvature, Ω_k,0; and the contribution of the cosmological constant, Ω_Λ,0, which we presently refer to mainly as dark energy. Since Ω_k,0 is defined such that these four parameters sum to unity, only three of them are independent. Determining the values of these parameters with ever-growing precision is one of the main aims of cosmology, as together with the Hubble constant, H_0 (the present rate of expansion), they fully determine the evolution of the Universe as a whole.

Currently, the strongest constraints on these numbers come from observations of the oldest light in the Universe: the cosmic microwave background, or CMB, a relic from the Big Bang. The CMB was last scattered when the Universe was only about 380,000 years old, at a time when the expansion had cooled the Universe down sufficiently for protons and electrons to combine and form neutral hydrogen and helium (an event called “recombination”), allowing light to travel freely for the first time.
The CMB shows us the Universe at the earliest time we could possibly observe its light, therefore informing us about the initial conditions, from which any successful model should be able to explain the properties of the Universe as we see it today.

The CMB is incredibly smooth, indicating a very high level of homogeneity: the relative variations in the density of baryonic matter (i.e. “normal”, visible matter) we see in it are of the order of 10⁻⁵ (see Figure 1.1). In order for these fluctuations to grow into the galaxies we see today, most matter in the Universe needs to be (cold) dark matter,² which is indeed what different observations indicate. Through very precise CMB measurements using e.g. the Wilkinson Microwave Anisotropy Probe (WMAP) and Planck satellites, the dark matter fraction and a host of other cosmological parameters (including all parameters mentioned thus far) can be determined with ever greater accuracy (e.g. Hinshaw et al., 2013; Planck Collaboration et al., 2013). These measurements indicate, for example, that the Universe is “flat” (i.e. space is not curved but Euclidean), and that although everything we observe directly is baryonic matter, the Universe is in fact dominated by dark matter and dark energy. We refer to the model that contains all these ingredients as the ΛCDM model, currently the standard model of Big Bang cosmology.

¹ In more detail, the evolution of the Universe also depends on the equations of state (the ratios of pressure and density) that characterise these parameters. The equation of state of the radiation content depends on the number of relativistic particle species, while that of the cosmological constant depends on the nature of dark energy; neither of these is completely certain yet.

² If dark matter is “warm” or “hot”, this means that it consists of particles with velocities that are sufficiently high at the time of decoupling to stream out of density fluctuations, thus preventing their growth. This process, called free streaming, sets a minimum scale above which fluctuations can form and depends mainly on the masses of the particles and their cross sections. While we already know that not all dark matter can be hot, warm dark matter is not yet completely ruled out, although its particle mass is strongly constrained (e.g. Viel et al., 2013).

Figure 1.1: The Cosmic Microwave Background (CMB) as captured by the Planck satellite. This is a snapshot of the Universe only 380,000 years after the Big Bang. It is homogeneous and isotropic on large scales, but very small fluctuations exist nonetheless. The colour scale shows relative differences of order 10⁻⁵.

1.2 Testing the standard cosmological model through clustering

As we mentioned, in order for any model of the Universe to be truly successful, it has to be able to explain all that we see on large scales today. This includes, for example, the current (accelerating) rate of expansion, but also the different galaxy populations we observe and the distribution of matter. The latter is the main focus of this thesis: the clustering of matter, i.e. the structure of the Universe. The matter distribution is completely determined by the initial conditions of the Universe; therefore, it is in principle possible to derive all cosmological parameters by examining how matter is organised in the present-day Universe, provided one understands how structure forms and evolves. In what follows, we present a simplified view of the formation of structure, starting from the very first density fluctuations in our Universe.

Figure 1.2: Illustration of the density field as a field of fluctuations. As a one-dimensional analogue, we show how seven independent harmonic waves with different amplitudes, wavelengths and phases (indicated by dotted lines) together add up to the fluctuation field indicated by the solid curve.
Notice that, because of the contributions of large waves, high-δ fluctuations are often found close together (i.e. they cluster).

1.2.1 Linear structure formation

Let us consider some part of the Universe with a mean density ρ̄. At³ any three-dimensional position x we can calculate a local density ρ(x), which may differ from the mean. We can now define the density contrast field, or density fluctuation field, as:

$$ \delta(\mathbf{x}) = \frac{\rho(\mathbf{x}) - \bar{\rho}}{\bar{\rho}} . \tag{1.1} $$

If δ is positive at some position x, this means that there is a local overdensity. Under the influence of gravity this overdensity will grow,⁴ attracting more and more matter and thereby forming structure.

³ To be precise, a density can never truly be defined at a singular location: one needs to assume some smoothing scale.

⁴ For baryonic matter, overdensities may be stable against collapse due to pressure forces. Dark matter, which dominates the matter content of the Universe, does not feel pressure, however, and is able to form structure more freely. We will briefly return to this point later in this chapter.

In order to understand what happens as these overdensities grow, let us first consider the simplest picture of structure formation: the linear one. Linear structure formation applies when the density fluctuations are very small, i.e. δ ≪ 1. This is indeed valid for the early Universe, which was extremely homogeneous and therefore contained only small fluctuations.⁵

As density fluctuations influence each other through gravity, they do not evolve independently. However, if we consider each density fluctuation as a superposition of plane waves, then in the linear regime these waves do evolve independently. Additionally, this view allows us to consider the growth of structure in a statistical sense, at given scales instead of at given locations, which is far more meaningful in a largely isotropic and homogeneous Universe. An illustration of this wave picture is shown in Figure 1.2.
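The wave picture can be sketched numerically. The following is a minimal illustration, not the procedure used to make Figure 1.2: it builds a one-dimensional fluctuation field from seven harmonic waves and then recovers the amplitude of each wave with a discrete Fourier transform. The box length, grid size, amplitudes and phases are all hypothetical values chosen for the example, and real cosines are used in place of complex plane waves.

```python
import numpy as np


def superpose_waves(x, amplitudes, wavenumbers, phases):
    """Sum the harmonic waves A_j * cos(k_j * x + phi_j) into one field."""
    x = np.asarray(x, dtype=float)[:, None]          # shape (n_points, 1)
    waves = amplitudes * np.cos(wavenumbers * x + phases)
    return waves.sum(axis=1)                         # shape (n_points,)


# Hypothetical set-up: a periodic box of unit length sampled on 256 points.
box_length = 1.0
n_points = 256
x = np.arange(n_points) * box_length / n_points

# Seven waves with integer mode numbers 1..7, so that each fits the box.
modes = np.arange(1, 8)
wavenumbers = 2.0 * np.pi * modes / box_length       # lambda = 2*pi / k
amplitudes = 1e-5 / modes                            # longer waves are stronger (assumed)
phases = np.random.default_rng(0).uniform(0.0, 2.0 * np.pi, size=modes.size)

delta = superpose_waves(x, amplitudes, wavenumbers, phases)

# Decompose the field again: normalised Fourier coefficients per mode.
delta_k = np.fft.rfft(delta) / n_points

# For a real cosine of amplitude A, the one-sided coefficient has modulus A/2.
recovered_amplitudes = 2.0 * np.abs(delta_k[1:8])

# Squared modulus of each coefficient: the amount of structure per scale.
power_per_mode = np.abs(delta_k) ** 2
```

Because each wave is stored in a single Fourier coefficient, the decomposition is unique, and the squared modulus of each coefficient measures how much structure is present at that wavelength. This is the quantity that the following paragraphs build on.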
A one-dimensional density fluctuation is shown as a solid black line. Each such fluctuation can be uniquely decomposed into harmonic waves with different amplitudes, wavelengths and phases; in this case, into the seven waves shown as dotted lines. Notice that, mainly due to the contributions of the longer waves, high-δ fluctuations tend to cluster, i.e. they are likely to be found close together. As we will see later, this has large consequences for how matter is organised today.

The relation between the spatial fluctuations δ and the density waves δ_k can be expressed through a discrete Fourier transform:

$$ \delta(\mathbf{x}) = \sum_{\mathbf{k}} \delta_{\mathbf{k}} \, e^{-i \mathbf{k} \cdot \mathbf{x}} , \tag{1.2} $$

where **k** is the wave vector, related to the wavelength by λ = 2π/|**k**|. We can now quantify the amount of structure on any Fourier scale k = |**k**| by squaring the amplitudes of these density waves, averaging over all waves with the same wavenumber k to obtain a statistic called the matter power spectrum:

$$ P(k) \equiv \left\langle |\delta_{\mathbf{k}}|^2 \right\rangle_k . \tag{1.3} $$

Inflationary theory⁶ predicts that the primordial power spectrum should scale as a power law:

$$ P(k) \propto k^{n_s} , \tag{1.4} $$

with a spectral index n_s that is very close to unity, meaning that to good approximation the fluctuations in the gravitational potential were scale invariant.⁷

⁵ Note that a completely homogeneous, uniform density field cannot exist: at the very least, microscopic variations in density must exist due to quantum mechanics. Incidentally, such quantum fluctuations are expected to be the seeds of the variations we see in the CMB (stretched out to macroscopic scales by a process called inflation very shortly after the Big Bang) and consequently of all structure existing today.

⁶ Recently, the BICEP2 team reported the first direct evidence for inflation, in the form of a gravitational wave signal in the CMB; see BICEP2 Collaboration et al. (2014).

⁷ The scale-invariant power spectrum is also called the Harrison-Zel’dovich power spectrum. Current CMB measurements by the Planck satellite indicate n_s ≈ 0.96.

How the linear density fluctuations evolved from primordial times is best described in terms of the scale factor of the Universe, a(t), which is a dimensionless length scale that gives a measure of the size of the Universe when it has an age t. By definition, a(t_0) = 1, where t_0 is the current age of the Universe. Since the Universe is continually expanding, a was smaller in the past, and infinitesimally small at the moment after the Big Bang. It is related to the redshift z through a = 1/(1 + z); at z = 1, the distance between two points in the Universe was thus half of what it is now. For reference, the redshift of the CMB/recombination is z ≈ 1100.

As we mentioned at the beginning of this chapter, the evolution of the Universe is determined by its constituents, most importantly matter, radiation and dark energy.⁸ Consequently, the growth of fluctuations at any time depends on which of these constituents dominates. In our simplified picture, a linear density fluctuation in some spherical volume is expected to grow as:⁹

$$ \delta \propto \frac{1}{\bar{\rho} \, a^2} , \tag{1.5} $$

where ρ̄ is the mean (energy) density of the dominant component of the Universe at some time t. The amount of matter in the Universe is (to a very high degree) constant, meaning that its density just scales inversely with the volume of the Universe:

$$ \bar{\rho}_{\rm m} \propto a^{-3} . \tag{1.6} $$

Therefore, when matter dominates the Universe, linear fluctuations grow as δ ∝ a. The energy density of radiation, on the other hand, does not only scale inversely with the volume, but picks up an additional factor of 1/a, since photons lose energy as they are redshifted during expansion. Hence:

$$ \bar{\rho}_{\gamma} \propto a^{-4} , \tag{1.7} $$

meaning that during radiation domination linear fluctuations may grow as δ ∝ a².
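As a quick check of the two growth rates just quoted, one can substitute the density scalings (1.6) and (1.7) into the growth relation (1.5). This is only the arithmetic behind the statements above, under the same simplified picture:

```latex
\delta \propto \frac{1}{\bar{\rho}_{\rm m}\, a^{2}}
       \propto \frac{1}{a^{-3}\, a^{2}} = a
\quad \text{(matter domination)},
\qquad
\delta \propto \frac{1}{\bar{\rho}_{\gamma}\, a^{2}}
       \propto \frac{1}{a^{-4}\, a^{2}} = a^{2}
\quad \text{(radiation domination)} .
```

In general, a dominant component with density scaling as a^n gives δ ∝ a^{-(n+2)} in this approximation.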
Finally, for dark energy, which is a property of space itself and therefore has a constant density, we have:

$$ \bar{\rho}_{\Lambda} \propto a^{0} , \tag{1.8} $$

meaning that when dark energy dominates, density fluctuations cannot grow at all: they are damped by the expansion of the Universe.

However, this damping is not exclusive to the Λ-dominated era. This is related to the existence of the horizon, the maximum distance between causally connected regions. If two regions are farther apart than this, i.e. farther than light (and gravity) could have travelled within the age of the Universe, then they could not have been in causal contact. Fluctuations on scales smaller than the horizon can be damped if the Universe expands faster than they collapse, which is the case during the era of radiation domination. In short, this limits the growth of linear fluctuations to at most logarithmic growth, which is much slower than the otherwise power-law growth.¹⁰ A summary of the growth rates for both sub- and superhorizon fluctuations is shown in the table below. Here λ is the wavelength of a fluctuation and r_H is the horizon scale.

            γ-dom.     m-dom.     Λ-dom.
λ < r_H     damped     δ ∝ a      damped
λ > r_H     δ ∝ a²     δ ∝ a      damped

Since the density of each constituent of the Universe scales differently with the scale factor, it is clear that each dominates in some era. Radiation, which scales most steeply with a, must have dominated when the Universe was very young (i.e. a was very small), followed by matter, followed by dark energy. Indeed, radiation dominated the content of the Universe up to a redshift of z ≈ 3600 (corresponding to an age of the Universe of approximately 50,000 years), and dark energy has been dominating since z ≈ 0.4 (for approximately the last 4.2 billion years), meaning that the Universe was matter-dominated during the majority of its existence, allowing new structure to form.

Combining this insight with the table shown above, we conclude that a special scale exists, namely the scale of the horizon at the transition between the radiation- and matter-dominated eras, r_H,eq. This scale depends on several cosmological parameters, but roughly¹¹ r_H,eq ∼ 10² h⁻¹ Mpc. Fluctuations larger than this scale were able to grow before the matter-dominated era, while smaller fluctuations were damped. Afterwards, linear fluctuations on all scales could grow at the same rate.

It can be shown that the damping of subhorizon fluctuations depends on their size: smaller fluctuations were damped more strongly by the expansion of the Universe. This means that the theoretical linear power spectrum, which started out as P(k) ∝ k^{n_s}, changed shape on scales λ < r_H,eq during radiation domination. Consequently, after the radiation-dominated era the power spectrum roughly looked as follows:

$$ P(k, t) \propto \begin{cases} k^{n_s - 4} & \text{for } \lambda < \lambda_{\rm eq} \\ k^{n_s} & \text{for } \lambda > \lambda_{\rm eq} . \end{cases} \tag{1.9} $$

We show the detailed power spectrum of linear fluctuations in Figure 1.3. The exact shape and features of this power spectrum depend on all the cosmological parameters of the standard ΛCDM model, most of which we have already mentioned: besides Ω_m,0, Ω_Λ,0, Ω_γ,0, Ω_k,0, H_0 and n_s, these are Ω_b,0 (the baryonic matter content of the Universe) and σ_8 (the normalisation of the power spectrum). As all of these influence the power spectrum independently, every one of them can be determined just by measuring the linear power spectrum to very high precision. This is essentially what we try to do when observing the CMB, which makes it the single most powerful observable for understanding our Universe as a whole.

⁸ As observations show the Universe to be almost completely flat, geometrically speaking, these are the only constituents of consequence.

⁹ This approximation is derived by considering the gravitational evolution of a linear fluctuation in an expanding Universe with mean density ρ̄, which is not trivial.

¹⁰ Which, in turn, is much slower than the exponential growth in a non-expanding Universe.

¹¹ The unit shown here for r_H,eq is the typical unit of distance used in cosmology. “Mpc” is shorthand for “megaparsec”, i.e. one million parsecs (a bit over three million light years), while “h” is the dimensionless Hubble constant, defined as h ≡ H_0/[100 (km/s)/Mpc] ≈ 0.7.

Figure 1.3: The theoretical matter power spectrum. The wavelength of fluctuations decreases towards the right. A dotted line shows the corresponding primordial power spectrum for n_s ≈ 1. At roughly the horizon scale at matter-radiation equality, r_H,eq ∼ 10² h⁻¹ Mpc, the power spectrum turns over and the power law index asymptotes to n_s − 4 ≈ −3. The dashed line shows a correction for non-linear growth at later times, which only affects scales of a few tens of Mpc or less.

Note that up until shortly before recombination, baryonic subhorizon fluctuations were unable to grow, even though the matter-dominated era had already begun. This is because photons were capable of dragging baryons along, damping their fluctuations.¹² Therefore, up until the time of the CMB only fluctuations made up of cold dark matter (which is the dominant form of matter in our Universe) were able to grow. Afterwards, when the baryons could collapse, they followed the dark matter perturbations that were already present. The distribution of dark matter therefore dictated where stars and galaxies would form.
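The sequence of eras used throughout this subsection can be checked with a short calculation. The sketch below is illustrative only: the density parameters are assumed, WMAP7-like values for a flat Universe and are not taken from the text, and the function names are hypothetical. It applies the scalings (1.6) to (1.8) to find which component dominates at a given redshift, and the growth relation (1.5) to find the corresponding growth exponent.

```python
import numpy as np

# Assumed present-day density parameters (illustrative, WMAP7-like, flat).
OMEGA_M0 = 0.272     # matter
OMEGA_L0 = 0.728     # dark energy (cosmological constant)
OMEGA_R0 = 8.4e-5    # radiation, including relativistic neutrinos (assumed)


def densities(a):
    """Densities of radiation, matter and dark energy at scale factor a,
    in units of today's critical density: a^-4, a^-3 and a^0 scalings."""
    a = np.asarray(a, dtype=float)
    return OMEGA_R0 * a**-4, OMEGA_M0 * a**-3, OMEGA_L0 * np.ones_like(a)


def dominant_component(z):
    """Name of the component with the largest density at redshift z."""
    a = 1.0 / (1.0 + z)                      # a = 1 / (1 + z)
    names = ("radiation", "matter", "dark energy")
    return names[int(np.argmax(densities(a)))]


def linear_growth_exponent(density_exponent):
    """For rho ∝ a^n, delta ∝ 1 / (rho * a^2) gives delta ∝ a^-(n + 2).
    In this simplified picture this applies to superhorizon fluctuations."""
    return -(density_exponent + 2)


# Redshifts at which two components have equal density.
z_radiation_matter = OMEGA_M0 / OMEGA_R0 - 1.0
z_matter_lambda = (OMEGA_L0 / OMEGA_M0) ** (1.0 / 3.0) - 1.0
```

With these assumed parameters the two transition redshifts land in the same range as the values quoted above (z ≈ 3600 and z ≈ 0.4); the exact numbers depend on the parameters adopted. A growth exponent of −2 for dark energy corresponds to the "damped" entries in the table.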
1.2.2 Non-linear evolution

As we mentioned before, a successful theory need not (and, within reason, cannot) predict the exact distribution of matter around us in absolute terms. Rather, it predicts how matter is organised in a statistical sense, for example by predicting its power spectrum. Up until now, we only considered what would happen to linear density fluctuations, i.e. fluctuations that are very small (δ ≪ 1). However, if we want to predict the clustering of matter not only in the very early Universe, but also today, we need to consider what happens to fluctuations that grow large enough to actually collapse, for which the simplified picture sketched above no longer applies. Without a theory that accurately predicts the amount of non-linear structure as a function of scale in the Universe today, we cannot test our model against observations.

¹² Related to this are the baryon acoustic oscillations (BAO). Gravity and pressure forces (the latter mainly caused by the photons) counteracted one another, causing oscillations in the baryonic fluctuations. These can be seen as small wiggles in the matter power spectrum around 100 h⁻¹ Mpc (see Figure 1.3) and are still imprinted on structure today.

Many useful insights can be gained from taking a perturbative analytical approach to non-linear structure formation. For example, it grants us expressions for the time it takes for a halo to form, the relative density at which it forms, and its final size and mass.¹³ However, since non-linear collapse is such a complex process, even when only considering dark matter, clustering predictions nowadays are mainly made by fitting to the results of simulations. These predictions are then compared to clustering measurements from observations in order to learn more about the underlying cosmology.

The general picture of non-linear structure formation looks as follows.
As fluctuations grow, they will generally not be spherically symmetric; consequently, they will collapse first along one direction, forming sheets of matter (also called pancakes, see e.g. Zel’dovich, 1970). It is around this time that fluctuations will enter the non-linear regime, meaning the approximations used in the previous subsection are no longer valid. These sheets will collapse along a second direction, forming filaments, which finally collapse to make what we call haloes, forming a cosmic web of filaments with very massive haloes at the nodes (hosting galaxy groups and clusters) and smaller ones throughout. The dark matter collapse then stops as these haloes virialise, meaning that they attain a quasi-static dynamic equilibrium between the internal gravitational forces and the random motions of their particles. Further growth then proceeds through the merging of haloes, especially in clustered environments where the probability of two haloes encountering one another is large. This non-linear evolution has important consequences for the clustering of matter, as shown by the dashed line in Figure 1.3.

It is the haloes that are of most interest to us, as these are the regions where galaxies form.

¹³ A formalism generally referred to as (extended) Press-Schechter theory, see e.g. Press & Schechter (1974), Bond et al. (1991) and Sheth, Mo & Tormen (2001).

1.2.3 The role of galaxy formation

Haloes are the highest-density regions of dark matter, which constitute the potential wells into which gas flows. Unlike dark matter, gas feels pressure and can radiate its energy away as photons, allowing it to cool and settle in the centres of haloes and form stars and galaxies, which merge, grow and evolve.

When we look out into the night sky, we do not see the majority of matter, which is in the shape of filaments and haloes. Instead, we see only the very peaks of the matter distribution, as this is where the galaxies reside (see bottom row of Figure 1.4).
Galaxies are therefore biased tracers of the cosmic density field,¹⁴ which means that we must understand the complicated physical processes through which galaxies formed and evolved in order to be able to use them to derive cosmological information. The better we understand the galaxy bias, the better we can constrain the large-scale properties of our Universe by measuring how galaxies cluster.

Galaxies are not just important to the structure of the Universe because they are biased tracers; they influence the clustering of (dark) matter as well, which we can measure through the gravitational effect of matter on light (called lensing). Through gas cooling, baryons can attain much higher densities than dark matter, and form more structure on galactic scales than dark matter alone could. The dark matter haloes respond to the formation of galaxies in their centres by contracting somewhat, thereby changing the amount of structure on small scales (e.g. Blumenthal et al., 1986). This needs to be taken into consideration when one tries to predict the clustering of matter based on the relatively simple dark matter only picture of the Universe. Even though dark matter is dominant and baryons trace it initially, they act differently on small scales.

However, galaxy formation is not only more complex, but also more violent than the formation of dark matter haloes. For example, when stars die they may explode as supernovae, potentially heating up large amounts of gas, which prevents this gas from forming stars. Together, supernovae in a galaxy may drive galactic fountains of gas, ejecting the gas out of the galaxy. For small enough galaxies (occupying low-mass haloes), supernovae may even destroy the galaxy altogether. Very massive galaxies may host an active galactic nucleus (AGN) in their centre, heating massive amounts of gas and ejecting it far out of the galaxy.
Because of these feedback processes, the pressure of the gas is increased and it will resist forming structure, meaning that the clustering of matter is lower than what would be expected from the simple dark matter picture. If enough gas is driven out of the galaxy, the dark matter haloes may respond in a way opposite to what we would naively expect: expanding on super-galactic scales (e.g. Velliscig et al., 2014; also see Chapters 2 and 3 of this thesis). This can even occur without feedback, due to pressure smoothing of virialised gas on large scales.

Many other physical processes involved in the formation and evolution of galaxies may also influence clustering predictions. Currently, clustering measurements are becoming so precise that the need to understand the physics of galaxy formation to comparable accuracy is rapidly increasing.¹⁵ Without it, we cannot test our theoretical models of cosmology against observations in a meaningful way.

¹⁴ Galaxies are often seen as biased tracers of haloes, which are in turn biased tracers of the entire matter distribution.

¹⁵ Even if we do not fully understand the physics of galaxy formation, we may still be able to self-calibrate our models or marginalise over parameters that describe the effects of e.g. feedback on halo profiles (see e.g. Zentner, Rudd & Hu, 2008; Yang et al., 2013; Zentner et al., 2013).

1.3 Numerical simulations

Because of the complexity and immense range of scales involved in the formation of galaxies, numerical simulations are the only way to test our models at the precision of current and upcoming observations. Different approaches to cosmological simulations are available.
For example, one could run a simulation in which one assumes all matter acts like dark matter (N-body or collisionless simulations, see Chapter 5), making it possible to simulate a given region at a far higher resolution than otherwise possible, and base the formation and evolution of the baryonic component of the Universe on these, generally assuming the dark matter is not affected by the baryons. By solving sets of coupled equations for the evolution of baryons and galaxies and using observations as constraints for the parameters involved, one can very quickly obtain predictions for other quantities on a large range of scales. This is the approach taken by semi-analytic models of galaxy formation (see Baugh, 2006, for a review on the methodology; also, see Chapters 4 and 6 of this thesis). Another way is to include the baryons in the simulation directly along with the dark matter, solving the gravitational and other physical equations involved simultaneously (hydrodynamical simulations, see Tormen, 1996, for an introduction and Springel, 2010, for a review on the method employed here; also, see Chapters 2 and 3 of this thesis). Because there are more complicated equations to solve, because there are more variables to track and because there are higher densities involved (decreasing the time steps), such simulations are often run at much lower resolution than pure N-body simulations in order to keep the computational time and memory consumption down. However, the trade-off is that fewer approximations have to be made and that the effects of the baryons on the dark matter are modelled explicitly. Since not all the relevant scales can be resolved (yet), the physics of e.g. star formation and supernova feedback have to be modelled in a comparable way to semi-analytics, which brings some uncertainties with it. 
It is here that much may be gained in coming years, as these physical recipes in both semi-analytical and hydrodynamical simulations are constantly being improved, leading to more realistic representations of our Universe. The matter distribution in a hydrodynamical simulation from the OWLS project (Schaye et al., 2010) referred to often in this thesis, called AGN−L100N512 (see Chapter 2), is shown in Figure 1.4. The region shown is 100 h−1 Mpc on a side and 10 h−1 Mpc thick. The distributions of cold dark matter, gas and stars are shown in separate columns, and from top to bottom cosmic time increases. For z = 127, the fluctuations are still linear, and no stars have formed yet. At z = 6, the cosmic web has started to take shape and galaxies have formed in the more massive haloes. The gas still traces the dark matter almost perfectly on large scales. By z = 0 (the present time), the cosmic web is more pronounced and all the galaxies we see today have formed, which trace the large-scale structure of the cold dark matter. The gas has been heated by gravitational accretion shocks and by feedback from both supernovae and AGN, and is distributed somewhat more smoothly than the dark matter.

Figure 1.4: This figure illustrates the growth of structure from a linear to a highly nonlinear state, for a comoving (meaning that we scale out the expansion of the Universe) volume 100 h−1 Mpc on a side. The slice shown here, showing the projected mass density, is 10 h−1 Mpc thick. Each column shows the evolution of a different component: from left to right, these are cold dark matter, gas, and stars. Each row shows the volume at a different cosmic time. At the starting redshift of the simulation, z = 127 (only 12 million years after the Big Bang), no significant structure has formed yet, and density fluctuations are still very small. At z = 6 (almost a billion years after the Big Bang), the dark matter is clearly collapsing and starting to form a cosmic web. Gas still traces the dark matter on the scales visible here. Galaxies, visible as small clumps of stars, have started forming at the points of highest density relatively recently. Finally, at z = 0 (the present, 13.8 billion years after the Big Bang), all the structure we see today has formed. Note that the gas no longer perfectly traces the dark matter, but is distributed somewhat more smoothly. This is mainly caused by energetic feedback processes associated with galaxy formation, heating the gas. The galaxies themselves, seen in the right-most panel, are clearly biased tracers of the overall mass distribution, having formed where the dark matter densities are highest.

[15, continued] However, this requires us to model the way baryons affect clustering in some way, and may come at the cost of decreasing the statistical significance of the measurements.

1.4 This thesis

In the near future it will be possible to measure the distribution of galaxies and matter to unprecedented precision. To get the most out of these observations and to avoid unwanted biases, our theoretical models will have to match the accuracy of real-life measurements. The fields of cosmology and galaxy formation are now more tightly tied together than ever before: we need to understand the processes involved in galaxy formation to interpret the clustering of matter and tie our observations to a set of cosmological parameters. Additionally, small-scale clustering measurements – which are less sensitive to cosmology – may help us to constrain our galaxy formation models. We explore all these topics in this thesis. In Chapter 2 we investigate the effects of galaxy formation on the clustering of matter through the use of the OWLS suite of simulations (Schaye et al., 2010; Le Brun et al., 2014), in which different physical processes were varied one at a time.
We compare the results of hydrodynamical simulations to those of dark matter only models, which are generally used to interpret weak lensing measurements of the matter distribution, and show that feedback from galaxy formation can have much larger effects on the matter power spectrum than previous studies have shown. We also investigate how the clustering of dark matter changes when such processes are included. Since the clustering of galaxies and the galaxy-galaxy lensing signal may be similarly affected, we also examine the two-point galaxy correlation function and the galaxy-matter cross-correlation in these simulations, in Chapter 3. We will show that efficient feedback can change the predictions by ∼ 10%, and although this shift is mainly due to the masses of both galaxies and haloes being systematically lowered, significant residual effects remain after correcting for the change in mass. Next, we explore the validity and consequences of several assumptions that are typically used in models based on the halo model and halo occupation distribution. Specifically, in Chapter 4 we investigate if and how the shapes and alignments of haloes are reflected in the clustering of galaxies, using the Guo et al. (2011) semi-analytical model run on the Millennium Simulation (Springel et al., 2005). We also ask the question whether it is possible to measure this form of “assembly bias” from galaxy surveys, without knowing the distribution of dark matter. In Chapter 5 we test the postulate of halo models that all matter resides in haloes, using collisionless simulations from the OWLS project. We calculate the clustering of matter in our simulations and compare it to the clustering of matter within haloes above a certain mass, exploring also the effects of using different halo definitions. Finally, in Chapter 6, we present a fast and accurate clustering estimator for use in semi-analytical models of galaxy formation.
Using this halo model based estimator, it is possible to predict the projected galaxy correlation function to an accuracy of ∼ 10% using only a very small subsample of haloes, meaning it can be used efficiently while exploring the parameter space of a model. By using clustering data as a constraint in addition to the usual one-point functions (such as the stellar mass or luminosity functions), degeneracies can be removed, improving both the match of the model to multiple data sets at once and our understanding of galaxy formation. We apply our model to the semi-analytical model of Guo et al. (2013), and show how the best-fit parameters change to bring the model into agreement with the newly-added constraints.

2 The effects of galaxy formation on the matter power spectrum: A challenge for precision cosmology

Upcoming weak lensing surveys, such as LSST, EUCLID, and WFIRST, aim to measure the matter power spectrum with unprecedented accuracy. In order to fully exploit these observations, models are needed that, given a set of cosmological parameters, can predict the non-linear matter power spectrum at the level of 1% or better for scales corresponding to comoving wave numbers 0.1 ≲ k ≲ 10 h Mpc−1. We have employed the large suite of simulations from the OWLS project to investigate the effects of various baryonic processes on the matter power spectrum. In addition, we have examined the distribution of power over different mass components, the back-reaction of the baryons on the CDM, and the evolution of the dominant effects on the matter power spectrum. We find that single baryonic processes are capable of changing the power spectrum by up to several tens of per cent.
Our simulation that includes AGN feedback, which we consider to be our most realistic simulation as, unlike those used in previous studies, it has been shown to solve the overcooling problem and to reproduce optical and X-ray observations of groups of galaxies, predicts a decrease in power relative to a dark matter only simulation ranging, at z = 0, from 1% at k ≈ 0.3 h Mpc−1 to 10% at k ≈ 1 h Mpc−1 and to 30% at k ≈ 10 h Mpc−1. This contradicts the naive view that baryons raise the power through cooling, which is the dominant effect only for k ≳ 70 h Mpc−1. Therefore, baryons, and particularly AGN feedback, cannot be ignored in theoretical power spectra for k ≳ 0.3 h Mpc−1. It will thus be necessary to improve our understanding of feedback processes in galaxy formation, or at least to constrain them through auxiliary observations, before we can fulfil the goals of upcoming weak lensing surveys.

Marcel P. van Daalen, Joop Schaye, C. M. Booth and Claudio Dalla Vecchia
Monthly Notices of the Royal Astronomical Society, Volume 415, Issue 4, pp. 3649-3665 (2011)

2.1 Introduction

One of the aims of cosmology is to find the initial conditions for structure formation in the Universe. These can be characterised by a single set of cosmological parameters, which directly influence the formation, growth and clustering of structure, and hence the distribution of matter as we observe it today. A powerful measure of the statistical distribution of matter (and a sufficient one for the case of Gaussian fluctuations) is the matter power spectrum, P(k), where k is the comoving wave number corresponding to a comoving spatial scale λ = 2π/k. Given a sufficiently accurate model for the formation of structure, we can infer the initial, linear power spectrum from the observed, non-linear one.
Moreover, as the rate of growth of structure depends on the expansion history, such a model also allows us to convert observations of the evolution of the power spectrum into measurements of other cosmological parameters such as the equation of state of the dark energy. Some of the most accurate measurements of the matter power spectrum come from studies of weak gravitational lensing (e.g. Massey et al., 2007; Fu et al., 2008; Schrabback et al., 2010), galaxy clustering (e.g. Cole et al., 2005; Reid et al., 2010) and the Lyα forest (e.g. Viel, Haehnelt & Springel, 2004; McDonald et al., 2006). Up to a few years ago, the statistical errors were sufficiently large that one could use analytical predictions (always assuming, amongst other things, that the Universe contains only dark matter), such as those by Peacock & Dodds (1996), Ma et al. (1999) and Smith et al. (2003a). The latter used ideas from the “halo model” (e.g. Peacock & Smith, 2000; Seljak, 2000; Cooray & Sheth, 2002) to improve upon the accuracy of simpler analytical predictions. In recent years the further improvement of this model has become increasingly dependent on the results from N-body simulations, such as the derived concentration-mass relation for dark matter haloes (e.g. Neto et al., 2007; Duffy et al., 2008; Hilbert et al., 2009). If baryonic effects were negligible, then these methods would allow the matter power spectrum to be predicted with an accuracy of ∼ 1% for wave numbers k ≲ 1 h Mpc−1 (Heitmann et al., 2010). However, we will show here that baryonic effects are larger than this on the scales relevant for many observations. Upcoming weak lensing surveys aim to measure the matter power spectrum on scales of 0.1 < k < 10 h Mpc−1.
In order to reach the level of precision their instruments are capable of, surveys such as LSST,[1] EUCLID,[2] and WFIRST[3] need to be calibrated using theoretical models that retain 1% accuracies on these scales (Huterer & Takada, 2005; Laureijs, 2009).[4] This is, however, not as straightforward as increasing the resolution of existing N-body simulations: many authors have demonstrated that on these scales baryonic matter, which is not accounted for in currently employed theoretical models, introduces deviations of up to 10% (White, 2004; Zhan & Knox, 2004; Jing et al., 2006; Rudd, Zentner & Kravtsov, 2008; Guillet, Teyssier & Colombi, 2010a; Casarini et al., 2011a). Recent hydrodynamic simulations include many of the physical processes associated with baryons, such as radiative cooling, star formation and supernova (SN) feedback. However, the processes which cannot be resolved in simulations are generally also not entirely understood, and different prescriptions exist that aim to model the same physics. Because of this, different authors may find significantly different results even when including the same baryonic processes. Furthermore, it is not a priori clear which physical effects are capable of changing the matter power spectrum at the 1% level and should therefore be included. These modelling uncertainties may thus prevent upcoming surveys from further constraining the cosmological parameters of our Universe.

[1] http://www.lsst.org/lsst
[2] http://www.euclid-imaging.net/
[3] http://wfirst.gsfc.nasa.gov/
[4] Since cosmological parameters are inferred from cosmic shear using a complicated weighting of the power spectrum over a range of scales and redshifts, the relation between the accuracy with which these parameters can be determined and the uncertainty in the models depends on the survey and is different for different parameters. Semboloni et al. (2011) present a more detailed study of the consequences of our findings for weak lensing surveys.

Here we employ a large suite of state-of-the-art cosmological, hydrodynamical simulations from the OWLS project (Schaye et al., 2010) to systematically study the effects of various baryonic processes on the matter power spectrum over a wide range of scales, k ∼ 0.1 − 500 h Mpc−1 . These processes include metal-line cooling, different prescriptions for SN feedback, and feedback from active galactic nuclei (AGN). We will see that all of our results are heavily influenced by the inclusion of AGN feedback, which was not considered by earlier studies and which has been shown to solve the overcooling problem that has long plagued hydrodynamical simulations and to lead to an excellent match to both the optical and X-ray properties of groups of galaxies (McCarthy et al., 2010, 2011). Outflows driven by AGN strongly increase the scale out to which baryons modify the power spectrum. We also investigate how the power is distributed over different components (i.e. CDM, gas and stars) and examine the back-reaction of the baryons on the dark matter. In a follow-up paper (Semboloni et al., 2011), we quantify the implications for current and proposed weak lensing surveys and we show how the uncertainty due to baryonic physics can be reduced by making use of additional observations of groups and clusters. This chapter is organised as follows. In §2.2 we discuss the simulations and the power spectrum estimator employed. In our main results section, §2.3, we compare our dark matter only simulation to analytical estimates (§2.3.1), we compare power spectra of simulations with different baryonic processes (§2.3.2), and we investigate how the power is distributed over different physical components (§2.3.3). In this section we also examine the back-reaction of galaxy formation on the dark matter (§2.3.4) and we consider the evolution of the most dominant effects on the power spectrum (§2.3.5). We compare to the results found by other authors in §2.4 and provide a summary in §2.5. 
Finally, we test the convergence of our results in Appendix A and provide tables of the power spectra of all simulations in Appendix B. We note that all distances quoted in this chapter are comoving and all power spectra are obtained at redshift zero, unless stated otherwise.

2.2 Simulations

The OWLS project (Schaye et al., 2010), where OWLS is an acronym for OverWhelmingly Large Simulations, is a suite of large, cosmological, hydrodynamical simulations. The code used is a heavily extended version of gadget iii, a Lagrangian code which was last described in Springel (2005a). It uses a TreePM algorithm to efficiently calculate the gravitational forces (where PM stands for Particle Mesh and the “Tree” describes the structure in which the particles are organised for this calculation, see for example Barnes & Hut, 1986; Xu, 1995; Bagla, 2002) and Smoothed Particle Hydrodynamics (SPH) to follow and evolve the gas particles (see Rosswog, 2009, for a review). There are two main sets of simulations, which have periodic boxes of size L = 25 and 100 h−1 comoving Mpc on a side, and are run down to redshifts z = 2 and 0, respectively. Most simulations use $512^3$ collisionless cold dark matter (CDM) particles and an equal number of baryonic (collisional gas or collisionless star) particles. We will refer to the particle number used in a simulation with the parameter $N = N_{\rm part}^{1/3}$ (= 512 for the high-resolution simulations). In this work we will focus on z = 0 and hence on the simulations using a 100 h−1 Mpc box. The particle masses are $4.06 \times 10^{8}\,h^{-1}\,\mathrm{M}_\odot\,[L/(100\,h^{-1}\,\mathrm{Mpc})]^{3}\,[N/512]^{-3}$ for the dark matter and $8.66 \times 10^{7}\,h^{-1}\,\mathrm{M}_\odot\,[L/(100\,h^{-1}\,\mathrm{Mpc})]^{3}\,[N/512]^{-3}$ for the baryons. The gravitational forces are softened on a comoving scale of 1/25 of the initial mean interparticle spacing, L/N, but the softening length is limited to a maximum physical scale of 2 h−1 kpc [L/(100 h−1 Mpc)], which is reached at z = 2.91.
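As a rough cross-check of the numbers quoted above, the following sketch recomputes the particle masses from the mean matter density and finds the redshift at which the physical softening cap takes over. It assumes the WMAP3 parameters given in §2.2.1 (Ωm = 0.238, Ωb = 0.0418) and the standard critical density of about 2.775 × 10^11 h² M⊙ Mpc⁻³, which is not stated in the text; the function names are illustrative and are not part of the OWLS code.

```python
RHO_CRIT = 2.775e11               # critical density in h^2 Msun / Mpc^3 (standard value, an assumption here)
OMEGA_M, OMEGA_B = 0.238, 0.0418  # WMAP3 values quoted in Section 2.2.1


def particle_masses(box_size=100.0, n=512):
    """Return (dark matter, baryon) particle masses in h^-1 Msun.

    box_size is L in h^-1 Mpc and n is N, the cube root of the particle number.
    """
    mass_per_unit_omega = RHO_CRIT * box_size**3 / n**3
    return (OMEGA_M - OMEGA_B) * mass_per_unit_omega, OMEGA_B * mass_per_unit_omega


def softening_transition_redshift(box_size=100.0, n=512):
    """Redshift at which the comoving softening equals the physical cap."""
    comoving_kpc = 1000.0 * box_size / n / 25.0   # 1/25 of L/N, in h^-1 kpc
    max_physical_kpc = 2.0 * box_size / 100.0     # 2 h^-1 kpc [L / (100 h^-1 Mpc)]
    # physical softening = comoving / (1 + z); solve for the z where it hits the cap
    return comoving_kpc / max_physical_kpc - 1.0


m_dm, m_b = particle_masses()
print(f"m_dm = {m_dm:.3e} h^-1 Msun, m_b = {m_b:.3e} h^-1 Msun")
print(f"softening cap reached at z = {softening_transition_redshift():.2f}")
```

With these assumptions the masses agree with the quoted values to better than one per cent, and the cap is reached at the quoted redshift; small differences are expected from rounding of the input constants.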
The SPH calculations use 48 neighbours. For the initial conditions, a theoretical matter power spectrum – which of course depends on the chosen set of cosmological parameters – is generated using cmbfast (Seljak & Zaldarriaga, 1996, version 4.1). Prior to imposing the linear input spectrum, the particles are set up in an initially glass-like state, as described in White (1994). The particles are then evolved to redshift z = 127 using the Zel’dovich (1970) approximation. On small scales, the physics of galaxy formation is unresolved, and subgrid models are needed to include baryonic effects like radiative cooling, star formation and supernova feedback. Although each OWLS run is a state-of-the-art cosmological simulation in itself, the real power of the OWLS project lies in the fact that it is composed of more than 50 simulations that all incorporate different sets of physical processes, parameter values, or subgrid recipes. In this way the effects of turning off or tweaking a single process can be studied in detail, making it especially well-suited to investigate which processes can, by themselves, change the power at k ∼ 1 − 10 h Mpc−1 by > 1%. In this chapter we briefly describe the subgrid physics included in the reference simulation, as well as the differences with respect to the simulations we compare to in §2.3.2. For a more detailed treatment of the simulations and the different physics models included, we refer to Schaye et al. (2010).

2.2.1 The reference simulation

As the intention of the OWLS project is to investigate the effects of altering or adding a single physical process, it is convenient to have a single simulation that acts as the basis for all other simulations. Such a “default” simulation should of course include many of the physical processes that we know to be important already, as ideally we would only want to vary one thing at a time. We call this simulation the reference simulation, or REF for short.
Note that this is not intended to be the “best” simulation, but simply a model to build on. In fact, it has for example been shown that AGN feedback, which was not included in the REF model and which we briefly discuss in the next section, is required to match observations of groups and clusters of galaxies (McCarthy et al., 2010, 2011). We assume cosmological parameter values derived from the Wilkinson Microwave Anisotropy Probe (WMAP) 3-year results (Spergel et al., 2007): {Ωm, Ωb, ΩΛ, σ8, ns, h} = {0.238, 0.0418, 0.762, 0.74, 0.951, 0.73}. Except for σ8, all of these are consistent with the WMAP 7-year data (Komatsu et al., 2011). This specific parameter describes the root mean square fluctuation in spheres with a radius of 8 h−1 Mpc linearly extrapolated to z = 0 and effectively normalises the matter power spectrum. Measurements in the last few years have systematically increased the value of σ8, which may influence the validity of our results. To check the effects of using “wrong” values for this and other cosmological parameters, we have re-run our two most important simulations – one with only dark matter and one in which AGN feedback is added to the reference model – using the WMAP7 cosmology. We briefly discuss these at the end of §2.2.2. As we shall see in §2.3.5, this change in cosmology does not affect our conclusions. The reference simulation includes both radiative cooling and heating, which are modelled using the prescription of Wiersma, Schaye & Smith (2009). Net radiative cooling rates are computed on an element-by-element basis in the presence of the cosmic microwave background and the Haardt & Madau (2001) model for the UV and X-ray background radiation from quasars and galaxies, taking into account the contributions of eleven different elements pre-computed using the publicly available photo-ionization package CLOUDY, last described by Ferland et al. (1998).
The effects of hydrogen ionization are modelled by switching on the Haardt & Madau (2001) model at z = 9.

Table 2.1: The different variations on the reference simulation that are compared in this chapter. Unless noted otherwise, all simulations use a set of cosmological parameters derived from the WMAP3 results and use identical initial conditions.

Simulation      Description
AGN             Includes AGN (in addition to SN feedback)
AGN−WMAP7       Same as AGN, but with a WMAP7 cosmology
DBLIMFV1618     Top-heavy IMF at high pressure, extra SN energy in wind velocity
DMONLY          No baryons, cold dark matter only
DMONLY−WMAP7    Same as DMONLY, but with a WMAP7 cosmology
MILL            Millennium simulation cosmology (i.e. WMAP1), η = 4 (twice the SN energy of REF)
NOSN            No SN energy feedback
NOSN−NOZCOOL    No SN energy feedback and cooling assumes primordial abundances
NOZCOOL         Cooling assumes primordial abundances
WDENS           Wind mass loading and velocity depend on gas density (SN energy as REF)
WML1V848        Wind mass loading η = 1, velocity vw = 848 km s−1 (SN energy as REF)
WML4            Wind mass loading η = 4 (twice the SN energy of REF)

Cosmological simulations do not yet come close to resolving the process of star formation, and so a subgrid recipe has to be included for this as well. In our simulations, gas particles can be converted into star particles once their hydrogen number densities exceed the threshold for thermo-gravitational instability ($n_{\rm H}^{*} = 0.1$ cm−3; Schaye, 2004). Cold gas particles with higher densities follow an imposed equation of state, $P \propto \rho^{\gamma_{\rm eff}}$. Here $\gamma_{\rm eff} = 4/3$, for which Schaye & Dalla Vecchia (2008) showed that both the Jeans mass and the ratio of the Jeans length to the SPH kernel are independent of the density, thus preventing spurious fragmentation due to a lack of numerical resolution.
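The density independence attributed to Schaye & Dalla Vecchia (2008) can be seen from a short scaling argument. The sketch below assumes the textbook scaling of the Jeans length with sound speed and density, and an SPH kernel size $h$ that encloses a fixed gas mass; neither assumption is spelled out in the text.

```latex
\begin{align*}
P \propto \rho^{\gamma_{\rm eff}}
  \;&\Rightarrow\; c_s^2 \sim \frac{P}{\rho} \propto \rho^{\gamma_{\rm eff}-1}, \\
\lambda_J \propto \frac{c_s}{\sqrt{G\rho}}
  \;&\Rightarrow\; \lambda_J \propto \rho^{(\gamma_{\rm eff}-2)/2},
  \qquad M_J \propto \rho\,\lambda_J^{3} \propto \rho^{(3\gamma_{\rm eff}-4)/2}, \\
h \propto \left(\frac{m_{\rm gas}}{\rho}\right)^{1/3} \propto \rho^{-1/3}
  \;&\Rightarrow\; \frac{\lambda_J}{h} \propto \rho^{(3\gamma_{\rm eff}-4)/6}.
\end{align*}
```

Both exponents vanish for $\gamma_{\rm eff} = 4/3$, so under these assumptions neither the Jeans mass nor the ratio $\lambda_J/h$ changes as the gas is compressed.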
Using the pressure-dependent prescription for star formation of Schaye & Dalla Vecchia (2008), the observed Kennicutt-Schmidt relation, a surface density scaling law for the star formation rate that can be written as $\dot{\Sigma}_{*} \propto \Sigma_{g}^{n}$ (Kennicutt, 1998), is reproduced by construction, independent of the imposed equation of state. The reference simulation assumes a Chabrier (2003) stellar Initial Mass Function (IMF) with low and high mass cut-offs at 0.1 and 100 M⊙, respectively. The release of hydrogen, helium and heavier elements by these stars to the surrounding gas is tracked as well: gas can be ejected through Type II SNe and stellar winds for massive stars, and Type Ia SNe and Asymptotic Giant Branch (AGB) stars for intermediate mass stars. This implementation of stellar evolution and chemical enrichment is discussed in Wiersma et al. (2009). Finally, the reference simulation includes a prescription for supernova feedback, discussed in Dalla Vecchia & Schaye (2008). Supernovae are capable of depositing a significant amount of energy in the surrounding gas, driving large-scale winds that may eject large amounts of gas, dramatically suppressing the formation of stars. In the model used here, the energy from SNe is injected into the gas kinetically. After a delay time of 30 Myr, a new star particle j will “kick” a neighbouring SPH particle i with a probability $\eta m_j / \sum_{i=1}^{N_{\rm ngb}} m_i$ in a random direction, giving it an extra velocity $v_w$. The reference simulation uses the values η = 2 for the initial wind mass loading and $v_w$ = 600 km s−1 for the initial wind velocity, which corresponds to 40% of the available kinetic energy for our IMF.

2.2.2 Other models

The OWLS project includes many variations on the reference simulation. We will now briefly discuss the simulations that we compare to in §2.3.2. The different models are listed in Table 2.1. For more details and other models we again refer to Schaye et al. (2010).
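The kinetic supernova feedback recipe of the reference model (§2.2.1) can be illustrated with a minimal sketch. It assumes equal-mass neighbours and uses hypothetical function names; it only mimics the stochastic kick described in the text and is not the implementation of Dalla Vecchia & Schaye (2008).

```python
import numpy as np

ETA_REF = 2.0       # initial wind mass loading of REF (from the text)
V_WIND_REF = 600.0  # initial wind velocity of REF in km/s (from the text)


def kick_probability(star_mass, neighbour_masses, eta=ETA_REF):
    """Probability eta * m_j / sum_i m_i that a given SPH neighbour is kicked."""
    return eta * star_mass / np.sum(neighbour_masses)


def sample_kicks(star_mass, neighbour_masses, eta=ETA_REF, v_wind=V_WIND_REF, rng=None):
    """Draw which neighbours are kicked and the velocity added to each (km/s)."""
    rng = np.random.default_rng() if rng is None else rng
    neighbour_masses = np.asarray(neighbour_masses, dtype=float)
    prob = min(1.0, kick_probability(star_mass, neighbour_masses, eta))
    kicked = rng.random(neighbour_masses.size) < prob
    # Isotropic random directions: normalise Gaussian-distributed vectors.
    directions = rng.normal(size=(neighbour_masses.size, 3))
    directions /= np.linalg.norm(directions, axis=1, keepdims=True)
    delta_v = np.where(kicked[:, None], v_wind * directions, 0.0)
    return kicked, delta_v


# 48 equal-mass neighbours, matching the number of SPH neighbours quoted in the text.
masses = np.ones(48)
p = kick_probability(1.0, masses)
print(f"kick probability per neighbour: {p:.4f}")
print(f"expected kicked mass per unit stellar mass: {p * masses.sum():.1f}")
```

By construction the expected kicked gas mass equals η times the mass of the new star particle, which is what the wind mass loading is meant to express.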
The simulation DMONLY includes only dark matter, hence the only active physical process is gravity. This model is useful, as many (semi-)analytical models for the matter power spectrum assume that baryons are unimportant on large scales. The NOSN simulation excludes supernova feedback, and the simulation NOZCOOL assumes primordial abundances when computing cooling rates. The simulation NOSN−NOZCOOL excludes both SN feedback and metal-line cooling. Naturally, none of the three simulations can be considered realistic as we know that the omitted processes exist, but they are valuable tools to investigate on what scales and in what measure these processes affect the total matter power spectrum. In fact, the same may be said for the other models we consider as all, except for AGN, suffer from the overcooling problem and hence apparently miss an important process that does occur in nature (be it AGN feedback or something else). Supernova feedback models suffer from large uncertainties due to the limited resolution of the simulations and a lack of observational constraints. Though the product of the initial wind mass loading and the initial wind velocity squared, $\eta v_w^{2}$, determines the energy injected into the winds per unit stellar mass and is therefore limited from above by the energy available from the SNe, the individual parameters are poorly constrained and can thus be varied. One variation on the reference model that uses the same SN energy per unit stellar mass as REF is WML1V848, in which the wind mass loading is reduced by a factor of 2 while the wind velocity is increased by a factor of $\sqrt{2}$. Another such variation is WDENS, in which the wind parameters scale with the density of the gas from which the star particle formed: the wind velocity as $v_w \propto n_{\rm H}^{1/6}$, and the wind mass loading as $\eta \propto v_w^{-2} \propto n_{\rm H}^{-1/3}$. Both parameters are equal to their fiducial values for stars formed at the density threshold for star formation.
For the polytropic EoS that we impose onto the ISM, the wind velocity in this model scales with the local effective sound speed, as might be the case for thermally driven winds. We also compare to models where the SN energy is varied. One scenario in which the SN energy may be higher than that in the reference model is when, under certain circumstances, the IMF becomes top-heavy, meaning that relatively more high-mass stars are produced. It is expected that the IMF is top-heavy at high redshift and low metallicity (e.g. Larson, 1998), and both observations and theory suggest that it may be top-heavy in extreme environments like the galactic centre and starburst galaxies (e.g. Baugh et al., 2005; Bartko et al., 2010). In the simulation DBLIMFV1618, the latter effect is modelled by a switch from the Chabrier IMF to one that follows $\phi(m) \propto m^{-1}$ once the gas reaches a certain pressure threshold, which is set so that ∼ 10% of the stellar mass forms with a top-heavy IMF. In this case, the emissivity in ionizing photons goes up by a factor 7.3, and it is assumed that the SN energy scales up by the same factor. In the model we consider here, this extra energy is used to raise the wind velocity by a factor $\sqrt{7.3}$. The final model that we consider that only differs from REF in terms of its wind parameters is WML4, in which the SN energy per unit stellar mass is doubled by simply increasing the wind mass loading by a factor of two. The same is done in the simulation MILL. However, the most important feature of the latter is that it uses the same values for the cosmological parameters as the Millennium simulation (Springel, Di Matteo & Hernquist, 2005). These are derived from first-year WMAP data and are given by: {Ωm, Ωb, ΩΛ, σ8, ns, h} = {0.25, 0.045, 0.75, 0.9, 1.0, 0.73}.
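Since the wind energy per unit stellar mass is set by the product of mass loading and velocity squared, the energy budgets of the wind variations described above can be compared with a few lines of code. This is a bookkeeping sketch with hypothetical function names; the WDENS normalisation assumes that the fiducial REF values apply exactly at the star formation threshold density, as the text states.

```python
import math

ETA_REF, V_REF = 2.0, 600.0  # REF wind mass loading and wind velocity (km/s)
N_H_STAR = 0.1               # star formation threshold density in cm^-3


def wind_energy_ratio(eta, v_wind):
    """Wind energy per unit stellar mass relative to REF (ratio of eta * v_w^2)."""
    return (eta * v_wind**2) / (ETA_REF * V_REF**2)


def wdens_parameters(n_h):
    """WDENS scalings: v_w grows as n_H^(1/6) and eta falls as n_H^(-1/3)."""
    scale = n_h / N_H_STAR
    return ETA_REF * scale ** (-1.0 / 3.0), V_REF * scale ** (1.0 / 6.0)


print("WML1V848:", wind_energy_ratio(1.0, 848.0))
print("WML4:    ", wind_energy_ratio(4.0, 600.0))
print("WDENS at n_H = 10 cm^-3:", wind_energy_ratio(*wdens_parameters(10.0)))
# Top-heavy IMF case: velocity raised by sqrt(7.3) at fixed mass loading.
print("velocity boosted by sqrt(7.3):", wind_energy_ratio(2.0, V_REF * math.sqrt(7.3)))
```

The first and third ratios come out at unity to within rounding of the quoted velocity, the second at two, and the last at 7.3, matching the descriptions in Table 2.1. For DBLIMFV1618 the boosted energy applies only to stars formed above the pressure threshold.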
The last and, for our purposes, most important physics variation we consider here adds a phenomenon that has proved to be increasingly necessary to reconcile theory and observations, from the scales of individual galaxies to clusters: Active Galactic Nuclei, or AGN. They are caused by the emission of large amounts of energy from the accreting supermassive black holes that reside at the centres of galaxies, in the form of radiation that may couple to the gas and relativistic jets caused by the magnetic field of the infalling material, which can heat and displace gas out to very large distances. AGN have been invoked to explain, for example, the low star formation rates of high-mass galaxies and the suppression of cooling flows in clusters. Moreover, Levine & Gnedin (2006) have used a toy model to demonstrate that AGN feedback may provide sufficient energy to have a large effect on the matter power spectrum. We model the growth of supermassive black holes and the associated feedback processes using the prescription detailed in Booth & Schaye (2009), which is an extension of that by Springel et al. (2005). During the simulation, a black hole seed particle with mass $m_{\rm seed} = 9 \times 10^{4}\,h^{-1}\,{\rm M}_\odot$ (i.e. $10^{-3}\,m_{\rm baryon}$) is placed at the centre of every dark matter halo whose mass exceeds $m_{\rm halo,min} = 4 \times 10^{10}\,h^{-1}\,{\rm M}_\odot$ (corresponding to $10^{2}$ dark matter particles). These particles then accumulate mass from the surrounding gas at an (Eddington-limited) rate based on Bondi-Hoyle-Lyttleton accretion (Bondi & Hoyle, 1944; Hoyle & Lyttleton, 1939), but scaled up by a factor $\alpha$ to account for the lack of a cold gas phase and the finite numerical resolution. However, for densities below our star formation threshold we do not expect a cold phase to be present and we therefore set $\alpha$ equal to unity. To ensure a smooth transition, $\alpha$ is made to depend on the density of the gas:

\[ \alpha = \begin{cases} 1 & \text{if } n_{\rm H} < n_{\rm H}^{*} \\ \left( \dfrac{n_{\rm H}}{n_{\rm H}^{*}} \right)^{\beta} & \text{otherwise.} \end{cases} \tag{2.1} \]

Here the density threshold $n_{\rm H}^{*}$ is the critical value required for the formation of a cold interstellar gas phase ($n_{\rm H}^{*} = 0.1\,{\rm cm}^{-3}$; see §2.2.1). Models of this type are called ‘constant-β models’, and the fiducial value $\beta = 2$ is used throughout this chapter. The black holes inject 1.5 per cent of the rest mass energy of the accreted gas into the surrounding matter in the form of heat. This feedback efficiency determines the normalisation, but not the slope, of the relations between black hole mass and galaxy properties. Booth & Schaye (2009) and Booth & Schaye (2011) demonstrate that this efficiency reproduces the observed relations between BH mass and both stellar mass and stellar velocity dispersion, as well as their evolution. McCarthy et al. (2010) have shown that the AGN simulation, but not the reference model, provides excellent agreement with both optical and X-ray observables of groups of galaxies at redshift zero. In particular, it reproduces the temperature, entropy, and metallicity profiles of the gas, the stellar masses, star formation rates, and age distributions of the central galaxies, and the relations between X-ray luminosity and both temperature and mass. We therefore consider simulation AGN to be more realistic than our other models. As we shall see in §2.3, the inclusion of AGN feedback greatly affects the power spectrum on a large range of scales. Finally, we have re-run two simulations, DMONLY and AGN, with cosmological parameters derived from the WMAP 7-year results (Komatsu et al., 2011): $\{\Omega_{\rm m}, \Omega_{\rm b}, \Omega_\Lambda, \sigma_8, n_{\rm s}, h\} = \{0.272, 0.0455, 0.728, 0.81, 0.967, 0.704\}$. These versions are called DMONLY−WMAP7 and AGN−WMAP7, respectively. We consider the latter to be our most realistic and up-to-date model.
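The density dependence of the accretion boost factor in equation (2.1) can be illustrated with a minimal Python sketch. The function and argument names are hypothetical and are not taken from the simulation code; the default values $n_{\rm H}^{*} = 0.1\,{\rm cm}^{-3}$ and $\beta = 2$ are the ones quoted above.

```python
def accretion_boost(n_h, n_h_star=0.1, beta=2.0):
    """Return the boost factor alpha of equation (2.1).

    n_h      : hydrogen number density of the gas in cm^-3
    n_h_star : threshold density for a cold gas phase in cm^-3
    beta     : power-law slope above the threshold
    (Names are illustrative, not those of the simulation code.)
    """
    if n_h <= 0.0:
        raise ValueError("density must be positive")
    if n_h < n_h_star:
        return 1.0  # below the threshold the accretion rate is not boosted
    return (n_h / n_h_star) ** beta  # equals 1 at n_h = n_h_star, so alpha is continuous


# Evaluate alpha below, at, and above the threshold density.
for n_h in (0.01, 0.1, 1.0):
    print(n_h, accretion_boost(n_h))
```

Because both branches give $\alpha = 1$ at $n_{\rm H} = n_{\rm H}^{*}$, the boost switches on without a jump, which is the smooth transition referred to in the text.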
Note that the linear input power spectra used for the initial conditions of these simulations have not been generated by cmbfast, but by the more up-to-date f90 package camb (Lewis & Challinor, 2002, version January 2010).

2.2.3 Power spectrum calculation

The distribution of matter in the Universe can be described by a continuous density function, $\rho(\mathbf{x})$, where the vector $\mathbf{x}$ specifies the position relative to some arbitrary origin. Given this density field, we consider fluctuations, $\delta(\mathbf{x})$, defined as:

\[ \delta(\mathbf{x}) \equiv \frac{\rho(\mathbf{x}) - \bar{\rho}}{\bar{\rho}}, \tag{2.2} \]

where $\bar{\rho}$ is the global mean density. We can relate this density contrast to wave modes $\hat{\delta}_{\mathbf{k}}$ via a discrete Fourier transform:

\[ \delta(\mathbf{x}) = \sum_{\mathbf{k}} \hat{\delta}_{\mathbf{k}}\, e^{-i\mathbf{k}\cdot\mathbf{x}}. \tag{2.3} \]

The density field can thus be seen as made up of waves with certain amplitudes and phases, with wave vectors $\mathbf{k}$. We now define the power spectrum, $P(k)$, as:

\[ P(k) \equiv V \left\langle |\hat{\delta}_{\mathbf{k}}|^{2} \right\rangle_{k}, \tag{2.4} \]

where $V$ is the volume under consideration. The power spectrum is therefore obtained by collecting the amplitudes-squared of all wave modes with the same wave number $k = |\mathbf{k}|$, and averaging them. This makes it clear that the power spectrum is a statistical tool, whose accuracy increases when more waves of the same length are available (i.e. when the scale $2\pi/k$ is small compared to the size of the box). We will present our results using what is often called the dimensionless matter power spectrum, which is defined as:

\[ \Delta^{2}(k) = \frac{k^{3}}{2\pi^{2}} P(k). \tag{2.5} \]

The dimensionless power spectrum scales with the mass variance, $\sigma^{2}(M)$, where $M = \frac{4\pi}{3} \left( \frac{2\pi}{k} \right)^{3} \bar{\rho}$. Note that using $\Delta^{2}(k)$ instead of $P(k)$ does not affect the relative differences between power spectra. The code we have chosen to use to obtain accurate power spectrum estimations from our simulations is the publicly available f90 package called powmes (Colombi et al., 2009).
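As a small illustration of the definitions above, the following Python sketch converts $P(k)$ into $\Delta^{2}(k)$, maps a wave number to its spatial scale $2\pi/k$ and to the associated mass $M$, and checks that ratios of power spectra are the same in either representation. All function names are hypothetical, and the numbers in the example are arbitrary placeholders rather than simulation output; this is not how powmes itself is organised.

```python
import math


def dimensionless_power(k, p_k):
    """Delta^2(k) = k^3 P(k) / (2 pi^2), as in equation (2.5)."""
    return k ** 3 * p_k / (2.0 * math.pi ** 2)


def wavelength(k):
    """Spatial scale lambda = 2 pi / k associated with wave number k."""
    return 2.0 * math.pi / k


def mass_scale(k, rho_mean):
    """Mass M = (4 pi / 3) (2 pi / k)^3 rho_mean used for sigma^2(M)."""
    return 4.0 * math.pi / 3.0 * wavelength(k) ** 3 * rho_mean


# Arbitrary illustrative values (not taken from any simulation).
k, p_a, p_b = 2.0, 30.0, 27.0

# The k^3 / (2 pi^2) prefactor cancels in a ratio at fixed k, so relative
# differences are identical for P(k) and Delta^2(k).
ratio_p = p_b / p_a
ratio_delta2 = dimensionless_power(k, p_b) / dimensionless_power(k, p_a)
print(ratio_p, ratio_delta2)
```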
The advantages of powmes stem from the use of the Fourier-Taylor transform, which allows analytical control of the biases introduced, and the use of foldings of the particle distribution, which allow the dynamic range to be extended to arbitrarily high wave numbers while keeping the statistical errors bounded. For a full description of these methods we refer to Colombi et al. (2009). We have compared the performance of powmes with that of power spectrum estimators using simple NGP, CIC and TSC interpolation schemes, and found that powmes is capable of obtaining far more accurate power spectra over a larger spectral range within the same computation time. We have extended powmes with the option to consider only one group of particles at a time, in order to see which parts of the power spectrum are dominated by the contribution of, for example, cold dark matter (see §2.3.3). Finally, we performed extensive timing tests using different grids, foldings, CPUs and particle numbers which, combined with the performance results from Colombi et al. (2009), resulted in the fiducial values $G = 256$ and $F = 7$ for the number of grid points on a side and the number of foldings, respectively.

2.2.3.1 Discreteness and other numerical limitations

In Appendix A we demonstrate that the simulations are sufficiently converged with respect to increases in the numerical resolution to predict the power spectrum with better than 1% accuracy for $k \lesssim 10\,h\,{\rm Mpc}^{-1}$. This range is greatly expanded in both directions if we only consider the relative differences in power between simulations. Besides numerical resolution, the predicted power spectra may be affected by sample variance, which is generally called cosmic variance in cosmology. This is caused by the finite volume of the box and by the fact that each simulation provides only a single realisation of the underlying statistical distribution.
Note that finite volume effects are different for the simulations than for observational surveys, because the mean density in the simulation boxes is always equal to the cosmic mean. In Appendix A we show that finite volume effects may cause us to underestimate the effects of baryons on scales of several tens of Mpc, i.e. close to the size of the box. While the fact that we only use a single realisation of the initial conditions prevents us from obtaining highly accurate absolute values for the power spectrum on scales close to the box size, it does not prevent us from investigating the relative changes in power caused by baryon physics. Finally, we are limited in our determination of the power spectrum by the discreteness of the density field. Because we use particles to represent a continuous field, there will always be non-zero power present at all scales, called white noise or shot noise. If we assume the particle distribution to be a local Poisson realisation of a stationary random field, an assumption used in any calculation of the power spectrum and one that is expected to be valid for an evolved distribution,⁵ this white noise component can be calculated (see, for example, Peebles, 1980, 1993; Colombi et al., 2009). Subtracting the shot noise from the initial estimate of the power spectrum will make the latter somewhat more accurate, but one should still expect the uncertainty on the estimate of the power spectrum to increase dramatically when the intrinsic power spectrum falls far below the shot noise level. The contribution of shot noise to $P(k)$ is independent of $k$. Hence, if we use $\Delta^{2}(k)$ as the measure of the power spectrum, then the shot noise level will scale as $k^{3}$. In the following section, the scale at which the shot noise of each simulation is equal to (the white noise corrected) $\Delta^{2}(k)$ is denoted by a circle, while it is shown explicitly in Appendix A.
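The $k^{3}$ scaling of the shot noise in $\Delta^{2}(k)$ can be made concrete with a short Python sketch. It assumes the standard Poisson result for $N$ equal-mass particles in a volume $V$, $P_{\rm shot} = V/N$, which is not written out in the text, and it uses hypothetical function names. The box size and particle number in the example are assumptions chosen to resemble an L100N512 run with a single particle species.

```python
import math


def shot_noise_power(volume, n_particles):
    """Poisson shot-noise level of P(k) for equal-mass particles: V / N.

    This is independent of k (white noise). It is the textbook Poisson
    expression, used here as an assumption rather than quoted from the text.
    """
    return volume / n_particles


def shot_noise_delta2(k, volume, n_particles):
    """Shot-noise level expressed in Delta^2(k); scales as k^3."""
    return k ** 3 * shot_noise_power(volume, n_particles) / (2.0 * math.pi ** 2)


def subtract_shot_noise(p_measured, volume, n_particles):
    """Remove the theoretical white-noise level from a measured P(k) value."""
    return p_measured - shot_noise_power(volume, n_particles)


box_size = 100.0        # assumed box side in h^-1 Mpc
n_particles = 512 ** 3  # assumed: one species of 512^3 equal-mass particles
volume = box_size ** 3

for k in (1.0, 10.0, 100.0):  # wave numbers in h Mpc^-1
    print(k, shot_noise_delta2(k, volume, n_particles))
```

In a run with several particle species of different masses the white-noise level is no longer simply $V/N$, so this sketch should be read as the single-species limit only.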
Note that the theoretical shot noise level has been subtracted for all power spectra shown in this chapter.

⁵ The discreteness noise can initially be much smaller if the particles are arranged on a grid or in a “glass-like” fashion. Particles in low-density regions may retain memory of their initial distribution, reducing the noise below the level expected for a Poisson distribution.

2.3 Results

In this section we present the power spectra obtained from our simulations. In §2.3.1 we compare the power spectrum of our dark matter only simulation to predictions from the literature. We investigate the effects of adding or modifying prescriptions for baryonic processes in §2.3.2. We examine how well CDM, gas, and stars trace each other and consider the contributions of these different components to the total power in the reference simulation in §2.3.3, and we examine the back-reaction of baryons on the CDM for the two most important simulations, REF and AGN, in §2.3.4. Finally, in §2.3.5, we take a closer look at model AGN, which we consider to be our most realistic simulation because it reproduces the optical and X-ray observations of groups of galaxies (McCarthy et al., 2010). We investigate the effect of using the WMAP7 rather than the WMAP3 cosmology, compare to widely used model power spectra, and consider the evolution of the effect of baryons on the matter power spectrum.

2.3.1 Comparison of a dark matter only simulation to models

In this section we compare the power spectrum of our DMONLY simulation to those predicted by the widely used models of Peacock & Dodds (1996, hereafter PD96) and Smith et al. (2003a, hereafter HALOFIT). The PD96 model is an extension of what is known as the HKLM model (Hamilton et al., 1991), which first introduced a universal analytical formula to map the linear correlation function into a non-linear one, the coefficients of which were estimated using N-body simulations.
Both of these models assume spherical collapse of fluctuations that have reached a certain overdensity, followed by stable clustering (which states that the mean physical separation of particles is constant on sufficiently small scales). Peacock & Dodds (1994), followed by PD96, expanded on the groundwork laid by HKLM by presenting a version of the method that worked with power spectra instead and allowed for $\Omega \neq 1$, a non-zero cosmological constant and large negative spectral indices. However, numerical simulations have shown that the assumption of stable clustering is not always valid. The more recent HALOFIT model by Smith et al. (2003a) aimed to improve on PD96 by taking this into account. This method is based on concepts from the “halo model”, in which the density field is viewed as a distribution of isolated haloes (e.g. Peacock & Smith, 2000; Seljak, 2000; Cooray & Sheth, 2002). It is then assumed that the power spectrum can be split into two parts: a large-scale quasi-linear term that is due to the clustering of separate haloes (the 2-halo term), and a small-scale term caused by the correlation of subhaloes within the same parent halo (the 1-halo term). Their resulting analytical formulae were fit to power spectra obtained from N-body simulations. To create power spectra that conform with these models and the cosmological parameters used in our simulations, we have utilised the publicly available package iCosmo, described in Refregier et al. (2011). We chose to generate the linear power spectra using the Eisenstein & Hu (1999, EH) transfer function. We have also tried using the Bardeen et al. (1986, BBKS) transfer function to generate initial conditions for the PD96 model, as this is the one originally used by the authors. This introduced only minor differences with respect to the results shown here ($1-10\%$ lower power for $k \gtrsim 10\,h\,{\rm Mpc}^{-1}$).
Figure 2.1: Comparison of the matter power spectrum of DMONLY−L100N512 with analytical fits by Peacock & Dodds (1996, PD96) and Smith et al. (2003a, HALOFIT) at redshift zero. The small circle, drawn in this and all following plots showing $\Delta^{2}(k)$, indicates the scale below which the (subtracted) shot noise in the simulation becomes significant, and the dashed purple curve shows the linear input power spectrum of the simulations. The bottom panel shows the ratios of the power spectra from theoretical models and the simulation. There is good agreement down to scales of a few Mpc, especially for the more recent HALOFIT model, but on smaller scales DMONLY predicts up to twice as much power as HALOFIT. For $\lambda < 10^{2}\,h^{-1}\,{\rm kpc}$ the power in the DMONLY simulation drops due to a lack of resolution.

In Figure 2.1 we compare these models to the simulation that, like the theoretical models for non-linear growth, includes only dark matter (DMONLY). For reference, the dashed curve shows the linear input power spectrum of the simulations. The bottom panel shows the ratio of the analytical predictions to our results. Note that we have omitted the first wave mode (at $\lambda = 100\,h^{-1}\,{\rm Mpc}$) in all of our figures because we cannot sample the power spectrum on the scale of the simulation box. We see that the dark matter power spectrum follows the analytical predictions well on large scales (except on the scale of the simulation box), and that HALOFIT provides a better match than the PD96 model, as expected. However, on scales below a few Mpc the theoretical models start to severely underestimate the amount of structure formed in the simulation, and the difference between HALOFIT and the DMONLY simulation reaches a factor of 2 on scales of $1-3 \times 10^{-1}\,h^{-1}\,{\rm Mpc}$.
The rapid decline of the DMONLY power spectrum for $k \gtrsim 100\,h\,{\rm Mpc}^{-1}$ ($\lambda < 10^{2}\,h^{-1}\,{\rm kpc}$) is mostly due to the underproduction of low-mass haloes due to the finite resolution (see Appendix A). While we will always show the power spectrum up to $k \approx 500\,h\,{\rm Mpc}^{-1}$, we are mainly interested in the scales relevant for upcoming surveys, $k \lesssim 10\,h\,{\rm Mpc}^{-1}$. As discussed in Appendix A, for $k \gg 10\,h\,{\rm Mpc}^{-1}$ numerical convergence may become an issue. Note that the power spectrum of the simulation remains reasonably well-behaved far below the theoretical shot noise level (i.e. well to the right of the small circle), indicating that the subtraction of this noise component is fairly accurate.

Figure 2.2: A comparison of the total matter power spectra of DMONLY−L100N512 (black), REF−L100N512 (green) and AGN−L100N512 (red), at redshift $z = 0$. The bottom panel shows the absolute value of the relative difference of the latter two with respect to DMONLY; solid (dashed) curves indicate that the power is higher (lower) than for DMONLY. The dotted, horizontal line shows the 1% level. Note that the first wave mode has been omitted as it holds no information. While pressure forces smooth the baryonic density field on intermediate scales, cooling allows the baryons to increase the total power on small scales. The addition of AGN feedback, which is required to match observations of groups, has an enormous effect, reducing the power by $\gtrsim 10\%$ for $k \gtrsim 1\,h\,{\rm Mpc}^{-1}$.

Newer implementations of the halo model exist, based on fits to more recent N-body simulations. These models improve on the HALOFIT model by including a variable concentration-mass relation (such as those derived by Neto et al., 2007; Duffy et al., 2008) and have been shown to reproduce the power spectra from simulations with higher accuracy (e.g. Hilbert et al., 2009). Since no suitable codes employing these models were available, we do not compare to their results here.
However, as Hilbert et al. (2009) have shown that using the halo model with the concentration-mass relation of Neto et al. (2007) increases the power at intermediate scales, we suspect that such models would provide a better match to the power spectrum of DMONLY.

2.3.2 The relative effects of different baryonic processes

In this section we present our main results, demonstrating how single baryonic processes, or implementations thereof, can influence the matter power spectrum. While we will focus mainly on the range of scales relevant to upcoming weak lensing surveys, $0.1 < k < 10\,h\,{\rm Mpc}^{-1}$, we will also discuss the differences at the much smaller scales that our simulations allow us to probe. We note again that all power spectra are taken from simulations with $L = 100\,h^{-1}\,{\rm Mpc}$ and $N = 512$ at redshift zero, and that, unless stated otherwise, all simulations are evolved from the same initial conditions. We start by comparing the power spectra of DMONLY, the reference simulation (REF) and AGN, in Figure 2.2. The panel at the bottom of most plots in this section shows the absolute value of the relative difference between power spectra. The dotted, horizontal line shows the 1% level: any differences above this level will thus affect the statistics of surveys that aim to measure the power spectrum to this accuracy. It is immediately clear from the comparison between DMONLY and REF that the contribution of the baryons is significant, decreasing the power by more than 1% for $k \approx 0.8-5\,h\,{\rm Mpc}^{-1}$. This is because gas pressure smooths the density field relative to that expected from dark matter alone. On scales smaller than $1\,h^{-1}\,{\rm Mpc}$ ($k \gtrsim 6\,h\,{\rm Mpc}^{-1}$), the power in the reference simulation quickly rises far above that of the dark matter only simulation, because radiative cooling enables gas to cluster on smaller scales than the dark matter.
These results confirm the findings of previous studies, at least qualitatively (e.g. Jing et al., 2006; Rudd, Zentner & Kravtsov, 2008; Guillet, Teyssier & Colombi, 2010a; Casarini et al., 2011a). However, when AGN feedback is included, the results change drastically. In this case, the reduction in power relative to DMONLY already reaches 1% for $k \approx 0.3\,h\,{\rm Mpc}^{-1}$ ($\lambda \approx 20\,h^{-1}\,{\rm Mpc}$) and exceeds 10% for $2 \lesssim k \lesssim 50\,h\,{\rm Mpc}^{-1}$. We thus see that AGN feedback even suppresses the total matter power spectrum on very large scales. The enormous effect of AGN feedback is due to the removal of gas from (groups of) galaxies. That large amounts of gas are indeed being moved to large radii in this simulation has been shown by, for example, Duffy et al. (2010, e.g. Figures 1 & 2) and McCarthy et al. (2011, e.g. Figure 3). Because the AGN reside in massive and thus strongly clustered objects, the power is suppressed out to scales that exceed the scale on which individual objects move the gas.

Figure 2.3: Comparisons of $z = 0$ power spectra predicted by simulations incorporating different physical processes to that predicted by the reference simulation. The panels are similar to the bottom panel of Figure 2.2, but now show differences relative to REF. The thin black curve that is repeated in all panels shows the relative difference with DMONLY. Colours indicate different simulations, while different line styles indicate whether the power is reduced or increased relative to the reference simulation. Top: A simulation without SN feedback (blue), one without metal-line cooling (green) and one that excludes both effects (red). SN feedback decreases the power on all scales. Metal-line cooling decreases the power for $\lambda > 0.4\,h^{-1}\,{\rm Mpc}$ but increases the power on smaller scales. The effects of removing both SN feedback and metal-line cooling are > 10% for $k > 20\,h\,{\rm Mpc}^{-1}$ and > 1% for $k > 2\,h\,{\rm Mpc}^{-1}$. Middle: Different SN wind models which all use the same amount of SN energy per unit stellar mass (see text). The effects of varying the implementation of SN feedback, while keeping the SN energy that is injected per unit stellar mass the same, are > 10% for $k > 10\,h\,{\rm Mpc}^{-1}$ and > 1% for $k > 1\,h\,{\rm Mpc}^{-1}$. Bottom: Models with different feedback energies and processes, see text for details. Including a top-heavy IMF at high pressure (DBLIMFV1618) or AGN feedback (AGN) greatly reduces the power. The reduction caused by the latter is > 10% for $k > 2\,h\,{\rm Mpc}^{-1}$ and > 1% for $k > 0.4\,h\,{\rm Mpc}^{-1}$.

Figure 2.3 shows the difference in the power spectra predicted by a variety of simulations relative to that predicted by the reference simulation. The models are listed in Table 2.1 and were described in §2.2.2. The top panel shows the effect of turning off SN feedback and/or metal-line cooling. Since SN feedback heats and ejects gas, we expect it to decrease the small-scale power. Indeed, the power in NOSN is > 1% higher than in the reference simulation for $k > 4\,h\,{\rm Mpc}^{-1}$ and the difference reaches 10% at $k \approx 10\,h\,{\rm Mpc}^{-1}$. The absence of SN feedback also increases the star formation rate, making stars the dominant contributor to the total matter power spectrum out to larger scales (not shown). Turning off metal-line cooling reduces the power on small scales because less gas is able to cool down and accrete onto galaxies. Indeed, model NOZCOOL predicts $10-50\%$ less power for $k \gtrsim 30\,h\,{\rm Mpc}^{-1}$. However, the absence of metal-line cooling increases the power by several percent for $\lambda \sim 1\,h^{-1}\,{\rm Mpc}$ because the lower cooling rates force more gas to remain at large distances from the halo centres.
Even though the effects of SN feedback and metal-line cooling are somewhat opposite in nature, as the former increases the energy of the gas while the latter allows the gas to radiate more of its thermal energy away, removing both processes in the simulation NOSN−NOZCOOL still introduces differences of about $1-10\%$ for $k \gtrsim 2\,h\,{\rm Mpc}^{-1}$ relative to the reference simulation. It is therefore vital to take both SN feedback and metal-line cooling into account if one wants to predict the matter power spectrum with an accuracy better than 10%. We compare models that use different prescriptions for SN feedback, but the same amount of SN energy per unit stellar mass as REF, in the middle panel of Figure 2.3. In WML1V848 the SN energy is distributed over half as much gas, but the initial wind velocity is a factor $\sqrt{2}$ higher, resulting in more effective SN feedback in all but the lowest mass galaxies. The differences with respect to the reference model extend to even larger scales than when SN feedback is removed entirely: the power is reduced by > 1% for $k \gtrsim 1\,h\,{\rm Mpc}^{-1}$ and by $\gtrsim 10\%$ for $k \gtrsim 10\,h\,{\rm Mpc}^{-1}$. In model WDENS the initial wind velocity increases with the local sound speed in the ISM, but the mass loading is adjusted so as to keep the amount of SN energy per unit stellar mass equal to that in REF. This implementation results in an even stronger decrease in power on scales $< 10\,h^{-1}\,{\rm Mpc}$. In both these models, the reduction in power is caused by the increased effectiveness of SN feedback in driving outflows of gas. We stress that because of our lack of understanding of the effects of SN feedback, there is a priori no reason to assume that the model used in the reference simulation is a better approximation to reality than the models we compare to here. In fact, it is possible that the SN energy per unit stellar mass is different from the value assumed in the REF model, or that it varies with environment.
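The bookkeeping behind the wind variations can be sketched in a few lines of Python. The sketch assumes that the wind energy per unit stellar mass is purely kinetic and therefore scales as mass loading times wind velocity squared, which is consistent with the statement that halving the mass loading while raising the velocity by $\sqrt{2}$ leaves the energy unchanged. The function names are hypothetical and all quantities are expressed relative to REF.

```python
import math


def wind_energy_ratio(mass_loading_ratio, velocity_ratio):
    """Wind energy per unit stellar mass relative to REF.

    Assumes kinetic winds, so energy scales as (mass loading) * (velocity)^2.
    Both arguments are ratios with respect to the REF values.
    """
    return mass_loading_ratio * velocity_ratio ** 2


def velocity_ratio_for_energy(energy_ratio, mass_loading_ratio=1.0):
    """Velocity boost needed to carry a given energy boost at fixed mass loading."""
    return math.sqrt(energy_ratio / mass_loading_ratio)


# WML1V848-like case: half the mass loading, sqrt(2) times the velocity.
print(wind_energy_ratio(0.5, math.sqrt(2.0)))
# WML4-like case: twice the mass loading at the REF velocity.
print(wind_energy_ratio(2.0, 1.0))
# Top-heavy IMF case: a 7.3 times larger energy put entirely into velocity.
print(velocity_ratio_for_energy(7.3))
```

Under this assumption the first case returns the REF energy, the second doubles it, and the third reproduces the $\sqrt{7.3}$ velocity factor used for the top-heavy IMF model.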
Model DBLIMFV1618, which we compare with REF in the bottom panel of Figure 2.3, uses a top-heavy IMF in high-pressure environments. Such an IMF yields more SNe per unit stellar mass which decreases the power by > 1% for $k > 0.7\,h\,{\rm Mpc}^{-1}$ and by > 10% for $k > 4\,h\,{\rm Mpc}^{-1}$. Clearly, it will be necessary to understand any environmental dependence of the IMF in order to predict the matter power spectrum to 1% accuracy on the scales relevant for upcoming surveys. On the other hand, doubling the wind mass loading, while keeping the wind velocity fixed to the value used in REF, as is done in WML4, has a far more modest effect. This is because the wind velocities are too low to significantly disturb the high-pressure ISM of massive galaxies. The differences with respect to the reference model are limited to $\lesssim 1\%$ for $k \lesssim 10\,h\,{\rm Mpc}^{-1}$.

Figure 2.4: Difference of the $z = 0$ matter power spectrum in a simulation using a WMAP1 cosmology (MILL) relative to that of the REF model, which assumes the WMAP3 cosmology, after rescaling the former to match the latter on the scale of the simulation box ($\lambda = 100\,h^{-1}\,{\rm Mpc}$, not shown). WML4 is shown for reference as this simulation uses the same baryonic physics as MILL. For $k \gtrsim 3\,h\,{\rm Mpc}^{-1}$, the effect of AGN feedback is at least as strong as that of this unrealistically large change in cosmology.

The bottom panel of Figure 2.3 also compares the reference simulation to model AGN, which differs from REF by the inclusion of a phenomenon that has been shown to play a role in many contexts and that strongly improves the agreement with observations of groups of galaxies (McCarthy et al., 2010). Like SN feedback, AGN feedback decreases the power by heating and ejecting gas, but the effect is more dramatic than that of the standard SN feedback model, both in scope and magnitude. With respect to the reference model, the power is decreased by $\gtrsim 30\%$ for $k > 10\,h\,{\rm Mpc}^{-1}$ and by $\gtrsim 5\%$ for $k > 1\,h\,{\rm Mpc}^{-1}$.
The reduction in power only falls below 1% for $k < 0.4\,h\,{\rm Mpc}^{-1}$ ($\lambda \gtrsim 10\,h^{-1}\,{\rm Mpc}$). Note that the effect of AGN feedback is strikingly similar to, albeit stronger than, that of the stellar feedback model that uses a top-heavy IMF in high-pressure environments. It is clear that many different baryonic processes, and even slightly different implementations thereof, are capable of introducing significant differences in the matter power spectrum on scales relevant for observational cosmology. To put the effects of baryons into perspective, we compare to a simulation with a very different cosmology, MILL, in Figure 2.4. The difference between the cosmology derived from the first-year WMAP data used in MILL and the one derived from the 3-year WMAP data used in the other simulations is large; in fact, the difference is much larger than the error bars of the most recent data allow. For reference, we note that the currently favoured cosmology (Komatsu et al., 2011) lies in between those given by WMAP1 and WMAP3. To account for the difference in normalisation of the MILL power spectrum, which is caused mainly by its higher $\Omega_{\rm m}$ and $\sigma_8$ values, we have rescaled it to have the same power at the box size as REF. Still, the effect on the power spectrum exceeds 10% for $k \gtrsim 0.2\,h\,{\rm Mpc}^{-1}$. A quick comparison with WML4, which uses the exact same baryon physics as MILL and twice the SN wind mass loading used in REF, shows that the effect of the change in mass loading is relatively small, as we had already shown in Figure 2.3. However, we see that for $k \gtrsim 3\,h\,{\rm Mpc}^{-1}$, the effect of AGN feedback is at least as strong as that of this unrealistically large change in cosmology. We thus conclude that baryonic effects are not only significant at the $\sim 1\%$ level, but can even be larger than a “very wrong” choice of cosmology.
Almost all theoretical models used in the literature consider only CDM, assuming that the baryons follow the dark matter perfectly for $k \lesssim 1\,h\,{\rm Mpc}^{-1}$. We have shown (see Fig. 2.2) that the fact that baryons experience gas pressure reduces the power on large scales, while their ability to radiate away their thermal energy increases the power on small scales. If we ignore AGN feedback, as has been done in all previous work, we find that the power is reduced by at least a few percent for $0.8 < k < 5\,h\,{\rm Mpc}^{-1}$ and that the power is increased for $k > 7\,h\,{\rm Mpc}^{-1}$, with the difference reaching approximately 6% at $k = 10\,h\,{\rm Mpc}^{-1}$ for the reference model. However, the single process of AGN feedback, which improves the agreement with observations of groups of galaxies, reduces the power by $\gtrsim 10\%$ over the whole range $1 \lesssim k \lesssim 10\,h\,{\rm Mpc}^{-1}$ and the reduction only drops below 1% for $k < 0.3\,h\,{\rm Mpc}^{-1}$. Highly efficient SN feedback, as may for example result from a top-heavy IMF in starbursts, would have nearly as large an effect. One can therefore not expect to constrain the primordial power spectrum more accurately until such processes are better understood and included in theoretical models.

2.3.3 Contributions of dark matter, gas and stars

Generally, power spectra are calculated using all matter inside the computational volume. This total matter power spectrum is what is measurable using e.g. gravitational lensing surveys. However, as we have a larger freedom of measurement using simulations, we can also consider the power in different components, for example to see which parts of the power spectrum are dominated by baryonic matter or how baryons change the distribution of cold dark matter. On sufficiently large scales the baryons will trace the dark matter. Hence, when averaged over these scales, the baryonic and CDM densities are given by

\[ \rho_{\rm cdm} = \frac{\Omega_{\rm m} - \Omega_{\rm b}}{\Omega_{\rm m}}\, \rho_{\rm tot}, \qquad \rho_{\rm bar} = \frac{\Omega_{\rm b}}{\Omega_{\rm m}}\, \rho_{\rm tot}. \tag{2.6} \]

We can now use these expressions to estimate the relative contributions of correlations between particle types to the total matter power spectrum. Using $P_{\rm tot}(k) \propto \langle |\hat{\rho}_{\rm tot}(k)|^{2} \rangle \propto \langle |\hat{\rho}_{\rm cdm}|^{2} \rangle + \langle \hat{\rho}_{\rm cdm} \hat{\rho}_{\rm bar}^{*} \rangle + \langle \hat{\rho}_{\rm cdm}^{*} \hat{\rho}_{\rm bar} \rangle + \langle |\hat{\rho}_{\rm bar}|^{2} \rangle$, we find, for sufficiently small $k$:

\[ \begin{aligned} P_{\rm cc} &= \frac{(\Omega_{\rm m} - \Omega_{\rm b})^{2}}{\Omega_{\rm m}^{2}}\, P_{\rm tot} \approx 0.68\, P_{\rm tot}, \\ P_{\rm cb} + P_{\rm bc} &= \frac{2\,\Omega_{\rm b}\,(\Omega_{\rm m} - \Omega_{\rm b})}{\Omega_{\rm m}^{2}}\, P_{\rm tot} \approx 0.29\, P_{\rm tot}, \\ P_{\rm bb} &= \frac{\Omega_{\rm b}^{2}}{\Omega_{\rm m}^{2}}\, P_{\rm tot} \approx 0.03\, P_{\rm tot}. \end{aligned} \tag{2.7} \]

Hence, on large scales we expect the power due to the auto-correlation of CDM to dominate the total matter power spectrum, with a significant contribution from the cross terms $P_{\rm cb}$ and $P_{\rm bc}$. The four panels of Figure 2.5 show power spectra for the REF−L100N512 (left) and AGN−L100N512 (right) simulations at $z = 0$, both for the total matter (solid black) and for individual components (coloured curves). For reference, we also show the power spectrum for DMONLY−L100N512 (dashed black). The top row shows the power spectra of $\delta_i \equiv (\rho_i - \bar{\rho}_i)/\bar{\rho}_i$. This definition ensures that the power spectra of all components $i$ converge on large scales, which allows us to examine how well different components trace each other. The bottom row, on the other hand, shows the power spectra of $\delta'_i \equiv (\rho_i - \bar{\rho}_{\rm tot})/\bar{\rho}_{\rm tot}$, which allows us to estimate the contributions of different components to the total matter power spectrum. Looking at the top-left panel, we see that, as expected, the baryonic components trace the dark matter well at the largest scales. However, significant differences exist for $\lambda \lesssim 10\,h^{-1}\,{\rm Mpc}$. Observe that, at scales of several hundred kpc and smaller, the difference between REF and the dark matter only simulation is larger than that between the latter and the analytical models we compared to earlier (see Fig. 2.1). In fact, the difference between the cold dark matter component of the reference simulation and DMONLY is also larger than that between the latter and the analytic models.
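The large-scale fractions of equation (2.7) follow directly from the density fractions of equation (2.6), as the minimal Python sketch below shows. The function name and dictionary keys are hypothetical. The example evaluates the fractions for the WMAP7 parameters quoted in §2.2.2, so the output differs slightly from the values in equation (2.7), which correspond to the cosmology of the reference runs.

```python
def large_scale_power_fractions(omega_m, omega_b):
    """Fractions of P_tot carried by CDM-CDM, cross, and baryon-baryon terms.

    Valid only on scales large enough that baryons trace the dark matter,
    as assumed in equations (2.6) and (2.7).
    """
    f_cdm = (omega_m - omega_b) / omega_m  # rho_cdm / rho_tot
    f_bar = omega_b / omega_m              # rho_bar / rho_tot
    return {
        "cdm_auto": f_cdm ** 2,            # P_cc / P_tot
        "cross": 2.0 * f_cdm * f_bar,      # (P_cb + P_bc) / P_tot
        "baryon_auto": f_bar ** 2,         # P_bb / P_tot
    }


# WMAP7 values of Omega_m and Omega_b as quoted in the text.
fractions = large_scale_power_fractions(0.272, 0.0455)
for name, value in fractions.items():
    print(name, round(value, 3))
```

The three fractions always sum to one, because they are the expansion of $(f_{\rm cdm} + f_{\rm bar})^{2}$ with $f_{\rm cdm} + f_{\rm bar} = 1$.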
This is due to the back-reaction of the baryons on the dark matter, which we will discuss in §2.3.4. Next, we turn to the bottom-left panel of Figure 2.5 which shows that cold dark matter dominates the power spectrum on large scales, as expected, although the contribution from the CDM-baryon cross power spectrum (not shown) is important as well. The contribution of baryons is significant for λ ! 102 h−1 kpc and dominates below 60 h−1 kpc. The strong small-scale baryonic clustering is the direct consequence of gas cooling and galaxy formation. Taking a look at how the baryonic component is itself built up, we see that gas dominates the baryonic power spectrum on large scales, but that stars take over for λ < 1 h−1 Mpc. The gas power spectrum flattens for λ ! 1 h−1 Mpc, which corresponds to the virial radii of groups of galaxies, but steepens again for λ ! 0.1 h−1 Mpc, i.e. galaxy scales. The reason for the decrease in slope around 1 h−1 Mpc is threefold. First, the pressure of the hot gas smooths its distribution on the scales of groups and clusters of galaxies. Second, as the gas collapses it fragments and forms stars. Third, due to stellar feedback the gas is distributed out to large distances, reducing the power. 34 2.3.3 Contributions of dark matter, gas and stars Figure 2.5: Decomposing the z = 0 total power spectra (black) into the contributions from cold dark matter (blue), gas (green) and stars/black holes (red). The left and right columns show results for REF−L100N512 and AGN−L100N512. In the top row the density contrast of each component i is defined relative to its own mean density, i.e. δi ≡ (ρi − ρ̄i )/ρ̄i . This guarantees that all power spectra converge on large scales, thus enabling a straightforward comparison of their shapes. In the bottom row the density contrast of each component is defined relative to the total mean density, i.e. δi ≡ (ρi − ρ̄tot )/ρ̄tot , which allows one to compare their contributions to the total power. 
The power spectrum of the gas flattens or even decreases for λ ≲ 1 h⁻¹ Mpc as a result of pressure smoothing, but its ability to cool allows it to increase again on galaxy scales (λ ≲ 10² h⁻¹ kpc). The power spectrum of the stellar component, which is a product of the collapse of cooling gas, increases most rapidly towards smaller scales. While stars dominate the total power for λ ≪ 10² h⁻¹ kpc in REF, dark matter dominates on all scales when AGN feedback is included.

The inclusion of AGN feedback greatly impacts the matter power spectrum on a wide range of scales. Comparing the top panels of Figure 2.5, we see that AGN feedback strongly decreases the power in the gas and stellar components relative to that of the dark matter for λ ≲ 1 h⁻¹ Mpc. A comparison of the bottom panels reveals that the contribution of stars to the total power is reduced the most, with the reduction factor increasing from an order of magnitude on the largest scales to more than two orders of magnitude on the smallest scales. This clearly shows that AGN feedback suppresses star formation, as required to solve the overcooling problem. For the gas component the change is also dramatic. While $\Delta^2_{\rm gas}(k) = 1$ for λ ∼ 3 h⁻¹ Mpc in REF, this level of gas power is only reached at 100 h⁻¹ kpc for AGN. The suppression of baryonic structure by AGN feedback makes dark matter the dominant component of the power spectrum on all scales shown, although it is important to note that the dark matter distribution is also significantly affected by the AGN, as we shall see next.

2.3.4 The back-reaction of baryons on the dark matter

Even though dark matter is unable to cool through the emission of radiation, its distribution can still be altered by the inclusion of baryons due to changes in the gravitational potential.
We examine this back-reaction of the baryons on the dark matter for the reference and AGN simulations in the left and right panels of Figure 2.6, respectively. In order to make a direct comparison, we have rescaled the density of the dark matter component of the simulations that include baryons by multiplying it by the factor $\Omega_{\rm m}/(\Omega_{\rm m} - \Omega_{\rm b})$. The blue curve shows the relative differences between the power spectrum of the rescaled CDM component and that of DMONLY. On scales k ≳ 2 h Mpc⁻¹, corresponding to spatial scales λ ≲ 3 h⁻¹ Mpc, the power in CDM structures in the reference simulation is increased by > 1% with respect to DMONLY. The difference continues to rise towards higher k, reaching 10% around k = 10 h Mpc⁻¹. Because the baryons can cool, they are able to collapse to very high densities, and in the process they steepen the potential wells of virialized dark matter haloes, causing these to contract. The effect is larger closer to the centres of these haloes, i.e. on smaller scales.

The back-reaction is quite different when AGN feedback is included.⁶ The dark matter haloes still contract on small scales, albeit by a smaller amount, but the power in the dark matter component of the AGN simulation is decreased for scales > 200 h⁻¹ kpc, corresponding to the sizes of haloes of $L_*$ galaxies. The reduction in the power of the CDM component in model AGN relative to DMONLY increases from roughly 1% at k = 3 h Mpc⁻¹ to almost 10% around k = 10 h Mpc⁻¹. AGN-driven outflows redistribute gas to larger scales, which reduces the baryon fractions in haloes and results in shallower potential wells. This is consistent with the results of Duffy et al. (2010), who used the same simulation to show that AGN feedback decreases the concentrations of dark matter haloes of groups and clusters.
Note, however, that because AGN can drive gas beyond the virial radii of their host haloes, their effect on the power spectrum cannot be fully captured by a simple rescaling of the halo concentrations.

⁶ The small difference in power between the CDM component of AGN and DMONLY near the size of the box is most likely caused by errors in the power spectrum estimation.

Figure 2.6: The back-reaction of baryons on the CDM. The blue curves show the relative difference between the power spectrum of the CDM component, after scaling the CDM density by the factor $\Omega_{\rm m}/(\Omega_{\rm m} - \Omega_{\rm b})$, and that of a dark matter only simulation for either the REF (top panel) or AGN (bottom panel) model. For comparison, the relative differences between the total matter power spectra of the baryonic simulations and DMONLY are shown by the black curves. Baryons increase the small-scale power in the CDM component. However, when AGN feedback is included, the power in the CDM component drops 1–10% below that of the DMONLY simulations for 0.2 ≲ λ ≲ 2 h⁻¹ Mpc.

2.3.5 A closer look at the effects of AGN feedback

In this section we examine our most realistic model for the baryonic physics, AGN, more closely.

Figure 2.7: The dependence of the effect of AGN on cosmology. The curves show the relative differences between the z = 0 matter power spectra for models AGN and DMONLY for our fiducial WMAP3 cosmology (green) and for the WMAP7 cosmology (red). Changing the cosmology has little impact on the relative effect of the baryonic processes.

2.3.5.1 Dependence on cosmology

Figure 2.7 shows how the relative difference between the z = 0 power spectra of models AGN−WMAP7 and DMONLY−WMAP7, both of which use the WMAP7 cosmology, compares to that between the same physical models in the WMAP3 cosmology (the latter case was already shown in Figure 2.2).
Even though the power spectra are themselves strongly influenced by, for example, the much higher value of σ₈ in the WMAP7 cosmology, the relative change in power due to baryons is nearly identical, at least so long as AGN feedback is included. This is good news for observational cosmology. It means that, once the large current scatter in implementations of subgrid physics has converged, it may be possible to separate the baryonic effects from the cosmological ones when modelling the matter power spectrum. It also means that we can assume that our results of the previous sections, which were based on the WMAP3 version of the AGN simulation, apply also to model AGN−WMAP7.

2.3.5.2 Evolution

Figure 2.8: Evolution of the relative difference between the matter power spectra of DMONLY−WMAP7 and AGN−WMAP7. From red to blue, redshift decreases from 3 to zero. The erratic behaviour of the z = 2 and z = 3 power spectra at the very smallest scales shown is due to a lack of resolution. For λ ≳ 1 h⁻¹ Mpc the reduction in power due to baryons evolves only weakly for z ≲ 1, but the transition from a decrease to an increase in power keeps moving to smaller scales.

Next, we use the AGN−WMAP7 simulation, which we consider to be our most realistic model, to investigate the dependence of the effect of baryon physics on redshift. Figure 2.8 shows the relative difference between the power spectra of DMONLY−WMAP7 and AGN−WMAP7 at redshifts 3, 2, 1, 0.5 and zero. We see from this plot that on large scales, λ ≳ 1 h⁻¹ Mpc, the reduction in power due to the gas does not evolve much for z ≲ 1, although the differences between the different redshifts remain large compared with the precision of upcoming surveys. The weak evolution below z = 2 is consistent with McCarthy et al. (2011), who found that the expulsion of gas due to AGN feedback takes place primarily at 2 ≲ z ≲ 4.
On scales below 1 h⁻¹ Mpc, on the other hand, the effects of baryonic processes on the power spectrum keep increasing with time, with the transition point between a decrease and an increase in power steadily moving towards smaller scales. This is probably because the ejection of low-entropy halo gas at high redshift (z ≳ 2) results in an increase of the entropy, and thus a reduction of the cooling rates, of hot halo gas at low redshift (McCarthy et al., 2011).

2.4 Comparison with previous work

Our predictions for the effect of baryons on the matter power spectrum agree qualitatively with those of other authors, provided we restrict ourselves to including the same baryonic feedback processes as were considered in those studies. However, previous simulations did not include AGN feedback and hence suffered from overcooling.⁷ As we have demonstrated, AGN feedback (or very efficient stellar feedback) has a dramatic effect on the matter power spectrum over a large range of scales. In this section we will consider both the qualitative and quantitative differences with respect to previous work, and examine how these may have come about.

⁷ The toy model of Levine & Gnedin (2006), which we briefly describe later in this section, did demonstrate, based purely on energetic grounds, that AGN feedback has the potential to have a large effect on the matter power spectrum.

Jing et al. (2006) used gadget ii (Springel, 2005a) to run a simulation with a 100 h⁻¹ Mpc box and 512³ gas and DM particles. Their simulation included radiative cooling and star formation, and used the Springel & Hernquist (2003) sub-grid model for the multiphase ISM and for galactic winds driven by star formation. Metal-line cooling and AGN feedback were not considered. They found that the power at k = 1 h Mpc⁻¹ is reduced by ∼ 1% relative to a dark matter only simulation at z = 0, which matches our results for the reference simulation very well. Furthermore, in agreement with our reference model, they find that the inclusion of baryons increases the power by ∼ 10% at k = 10 h Mpc⁻¹. However, they find that the transition from a relative decrease to a relative increase in power occurs at k ≈ 2 h Mpc⁻¹, while we find that it lies at k ≈ 6 h Mpc⁻¹. As the simulation of Jing et al. (2006) excludes metal-line cooling, we expect their results to be in better agreement with our own results for NOZCOOL. The main difference with respect to the reference simulation turns out to be the position of the transition point from a relative decrease to a relative increase in power, which shifts to k ≈ 2–3 h Mpc⁻¹ when metal-line cooling is turned off. Hence, using the simulation NOZCOOL, we reproduce both the qualitative and quantitative results of Jing et al. (2006), even though baryonic processes such as SN feedback are not implemented in the same way.

Rudd, Zentner & Kravtsov (2008) used the art code (Kravtsov, 1999) expanded with the Eulerian hydrodynamics solver described in Kravtsov, Klypin & Hoffman (2002). They used a 60 h⁻¹ Mpc box with 256³ particles, and included radiative cooling and heating, metal-line cooling, star formation, thermal SN feedback (which is described in Kravtsov, Nagai & Vikhlinin, 2005) and chemical enrichment. AGN feedback was not considered. The effect of the baryons on the matter power spectrum they found is far more dramatic than that found by Jing et al. (2006) and ourselves: a decrease in power of up to ∼ 10% relative to a dark matter only simulation for k < 1 h Mpc⁻¹, and a relative increase in power at k ≳ 1 h Mpc⁻¹ which already reaches ∼ 50% at k ≈ 5 h Mpc⁻¹. The reason for these large differences is unclear.

Guillet, Teyssier & Colombi (2010a) used the MareNostrum simulation, which was run using the adaptive mesh refinement code ramses (Teyssier, 2002), to investigate the effects of baryons on both the variance and the skewness of the mass distribution. They used a 50 h⁻¹ Mpc box with 1024³ dark matter particles and included metal-dependent gas cooling, UV heating, star formation, SN feedback (using the kinetic feedback prescription of Dubois & Teyssier, 2008) and metal enrichment. AGN feedback was not considered. Unfortunately, they were not able to run their simulations down to z = 0, but quote results at redshift 2 instead. In order to better compare to their results, we have examined the power spectra of REF−L100N512 and DMONLY−L100N512 at z = 2. In our reference simulation the scale on which baryons significantly reduce the power increases with time (note that AGN shows the opposite behaviour, see Fig. 2.8): in REF the 1% level is first reached at k ≈ 2 h Mpc⁻¹ for z = 2 and at k ≈ 0.8 h Mpc⁻¹ for z = 0. Meanwhile, the effect on the power on scales k ≳ 10 h Mpc⁻¹ hardly changes, and the transition scale from a decrease to an increase in power relative to DMONLY remains fixed at k ≈ 7 h Mpc⁻¹. Guillet, Teyssier & Colombi (2010a), on the other hand, do not detect a systematic decrease in power due to baryons at any scale. They find that the power is increased by 1% relative to a dark matter only simulation at k ≈ 3 h Mpc⁻¹, reaching 40% at k ≈ 10 h Mpc⁻¹. For our reference model we instead find a 2% decrease for k ≈ 3 h Mpc⁻¹ and only a 6% increase at k ≈ 10 h Mpc⁻¹. It is hard to say why these results lie so far apart, and especially why the baryons in their simulation do not reduce the power on large scales due to pressure effects.

We also compare to the recent study by Casarini et al. (2011a), who use the SPH code gasoline (Wadsley, Stadel & Quinn, 2004) to perform their simulations. They use two different volumes: a box of 64 h⁻¹ Mpc on a side, and a much larger 256 h⁻¹ Mpc box, both with only 256³ dark matter and an equal number of gas particles.
Note that the mass resolution of their L = 64 h⁻¹ Mpc run is comparable to that of our fiducial run, while the resolution of their L = 256 h⁻¹ Mpc run is much poorer. They include radiative cooling, a UV background, star formation and SN feedback. For the latter they use the prescription of Stinson et al. (2006), in which Type II SNe are modelled using an analytical treatment of blastwaves combined with manually turning off radiative cooling. Metal-line cooling and AGN feedback were not considered. Using their 64 h⁻¹ Mpc box, Casarini et al. (2011a) find an ∼ 1% decrease in power at k ≈ 1–2 h Mpc⁻¹ and an increase in power at smaller scales, which reaches 20% at k ≈ 10 h Mpc⁻¹. These results are in reasonable agreement with both Jing et al. (2006) and our model NOZCOOL. However, when using their 256 h⁻¹ Mpc box, they – like Guillet, Teyssier & Colombi (2010a) – find no decrease in power due to baryons at any scale, but instead a steady increase in power that reaches 1% at k ≈ 1–3 h Mpc⁻¹ and 40% at k ≈ 10 h Mpc⁻¹.

Finally, we discuss the work by Levine & Gnedin (2006), who used a toy model, rather than a hydrodynamic simulation, to evaluate the potential effect of AGN feedback on the matter power spectrum. In their models only the evolution of dark matter was followed explicitly. The gas was assumed to trace the dark matter at all scales and galaxy formation and the associated physical processes were not included. Their standard simulation volume is 64 h⁻¹ Mpc on a side, and the simulation was run with resolutions of 1, 0.5 and 0.25 h⁻¹ Mpc. We note that even their highest resolution is more than two orders of magnitude below the spatial resolution in our standard simulations. The gas was assumed to have a constant temperature of 1.5 × 10⁴ K at all redshifts.
A quasar luminosity function was used to determine the number of AGN at a given redshift and luminosity, which were then each placed at a random location, although biased towards high-density regions. Of the AGN’s bolometric luminosity, a fraction $\epsilon_{\rm k} = 1\%$ was used to drive spherically symmetric outflows. Within these outflow regions the baryon fraction was assumed to be zero. After computing the power spectrum, they found a large discrepancy between simulations with different resolutions: when using a resolution of 1 h⁻¹ Mpc, they found a reduction of roughly 10% in power for 0.3 ≲ k ≲ 3 h Mpc⁻¹ at z = 0, relative to a simulation which did not include AGN, while their higher-resolution runs produced instead an increase in power at all scales, of up to 20%. We found a decrease in power of 1% at k ≈ 0.3 h Mpc⁻¹, reaching > 10% for 2 ≲ k ≲ 50 h Mpc⁻¹, which does not agree with their results, even in terms of the sign of the effect. Nevertheless, we do confirm the conclusion of Levine & Gnedin (2006) that AGN feedback can greatly affect the matter power spectrum on a wide range of scales.

Even though our current understanding of galaxy formation still allows for significant deviations between studies, some qualitative results are the same: in the absence of AGN feedback, baryons will affect the matter power spectrum significantly on scales k ∼ 1–10 h Mpc⁻¹. Furthermore, all studies agree that the increase in power due to baryons is of the order of 10% at k = 10 h Mpc⁻¹. Jing et al. (2006), our reference and NOZCOOL models, and Casarini et al. (2011a) for their high-resolution simulation all predict a relative decrease in power of ∼ 1% at k ≈ 1 h Mpc⁻¹. Rudd, Zentner & Kravtsov (2008) also find a decrease in power due to baryons, but in their case the effect is far stronger than that of any other study, and is seen at much larger scales (k ≲ 1 h Mpc⁻¹).
However, like our reference simulation, all these simulations suffer from the well-known overcooling problem. As was demonstrated by McCarthy et al. (2010), the AGN simulation does not. We have shown that the inclusion of AGN has a tremendous effect on the matter power spectrum for λ ≲ 10 h⁻¹ Mpc, both when compared to a simulation that includes only dark matter and when compared to simulations that include baryons and galaxy formation but not AGN feedback. Therefore, contrary to what, for example, Guillet, Teyssier & Colombi (2010a) claim, simulations that suffer from overcooling cannot be considered extreme models for which the effects of baryons on the total matter power spectrum are maximised. Instead, they are prone to underestimate the effects on large scales. Indeed, model AGN predicts a relative decrease in power of ∼ 1% already at k = 0.4 h Mpc⁻¹. The decrease in power reaches several tens of percent on scales k ∼ 1–10 h Mpc⁻¹, while simulations that suffer from overcooling instead predict a strong increase in at least part of this range. Based on our results and on the comparison to other studies, we argue that the inclusion of AGN in cosmological simulations is at present even more important than the improvement or convergence of existing prescriptions for other baryonic effects.

Motivated by the results of Rudd, Zentner & Kravtsov (2008), Zentner, Rudd & Hu (2008) have proposed a method to account for the effects of galaxy formation on the matter power spectrum. This method assumes that the effects of baryons can be captured by a change in the halo concentration-mass relation. However, it is unlikely that such an approach can truly model the effects of baryons on the power spectrum. Since AGN-driven outflows significantly affect scales much larger than the sizes of individual haloes, the assumption made by Zentner, Rudd & Hu (2008) will certainly not be valid when AGN feedback is included.
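Because the comparisons in this chapter quote wavenumbers k and spatial scales λ side by side, it can help to convert between the two explicitly. The pairs given in §2.3.4 (for example k ≈ 2 h Mpc⁻¹ with λ ≈ 3 h⁻¹ Mpc) are consistent with λ = 2π/k, which is the relation assumed in the minimal Python sketch below; the helper name `wavelength_from_k` is hypothetical.

```python
import math


def wavelength_from_k(k_h_per_mpc):
    """Convert a wavenumber k [h Mpc^-1] into a scale lambda = 2*pi/k [h^-1 Mpc]."""
    if k_h_per_mpc <= 0.0:
        raise ValueError("k must be positive")
    return 2.0 * math.pi / k_h_per_mpc


# Wavenumbers that are quoted in the text of this chapter.
for k in (0.4, 1.0, 2.0, 3.0, 10.0, 30.0):
    print(f"k = {k:5.1f} h/Mpc  ->  lambda = {wavelength_from_k(k):6.3f} Mpc/h")
```

Under this assumed convention, the range 3 ≲ k ≲ 30 h Mpc⁻¹ maps onto roughly 0.2 to 2 h⁻¹ Mpc, in line with the range given in the caption of Figure 2.6.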
2.5 Conclusions

Upcoming weak lensing surveys, such as LSST, EUCLID, and WFIRST, aim to measure the matter power spectrum with unprecedented accuracy. In order to fully exploit these observations, theoretical models are needed that can predict the nonlinear matter power spectrum at the level of 1% or better on scales corresponding to 0.1 ≲ k ≲ 10 h Mpc⁻¹. Here, we have employed a large suite of simulations from the OWLS project, as well as the highly accurate power spectrum estimator powmes, to investigate the effects of various baryonic processes on the matter power spectrum. These tools have also enabled us to examine the distribution of power over different mass components, the back-reaction of baryons on the CDM, and the evolution of the dominant effects on the matter power spectrum.

Our most important finding is that the feedback processes that are required to solve the overcooling problem (i.e. the overproduction of stars) have a dramatic effect on the matter power spectrum. Such efficient feedback, most likely in the form of outflows driven by AGN, was not present in the simulations used in previous studies of the effects of baryons on the matter power spectrum (Jing et al., 2006; Rudd, Zentner & Kravtsov, 2008; Guillet, Teyssier & Colombi, 2010a; Casarini et al., 2011a). Although it was generally assumed that overcooling would make the simulations conservative, in the sense that they would overestimate the baryonic effects, we demonstrated that the opposite is true. The efficient outflows that are required to reproduce optical and X-ray observations of groups of galaxies redistribute the gas on large scales, thereby reducing the total power by ≳ 10% on scales k ≳ 1 h Mpc⁻¹.

We emphasise that the model from which we draw this conclusion, the simulation that includes AGN feedback, is not extreme. On the contrary, we consider it our most realistic model. McCarthy et al.
(2010, 2011) showed that it provides excellent agreement with both optical and X-ray observables of groups of galaxies at redshift zero. In particular, it reproduces the temperature, entropy, and metallicity profiles of the gas, as well as the stellar masses, star formation rates, and age distributions of the central galaxies, and the relations between X-ray luminosity and both temperature and mass.

We showed that metal-line cooling, star formation, and feedback from SNe all modify the matter power spectrum by > 1% on the scales relevant for upcoming surveys. In the absence of AGN feedback, the simulations with baryons have ∼ 1% less power relative to a dark matter only simulation on scales 0.8 ≲ k ≲ 6 h Mpc⁻¹ (a consequence of gas pressure) and > 10% more power for k ≳ 10 h Mpc⁻¹ (a consequence of gas cooling). However, as we noted above, AGN feedback can decrease the power for 1 ≲ k ≲ 10 h Mpc⁻¹ by up to several tens of percent. Furthermore, some implementations of stellar feedback, e.g. the strong SN feedback resulting from a top-heavy stellar initial mass function in starbursts, can create differences of the same scope and magnitude by redistributing gas out to very large scales. The effects from such baryonic processes on the matter power spectrum can even exceed those of a very large change in cosmology (e.g. WMAP3 to WMAP1). Indeed, differences > 1% persist even up to scales as large as those corresponding to k ≈ 0.3 h Mpc⁻¹.

In the absence of AGN feedback, the back-reaction of baryons on the dark matter increases the power in the CDM component by 1% at k ≈ 2 h Mpc⁻¹ and the effect becomes larger towards smaller scales. However, when AGN are included they redistribute sufficiently large quantities of gas out to large radii to lower the power in the dark matter component by 1–10% for 3 ≲ k ≲ 30 h Mpc⁻¹. This is consistent with Duffy et al.
(2010), who used the same simulation to show that AGN feedback decreases the concentrations of dark matter haloes of groups of galaxies. We stress, however, that the back-reaction of AGN feedback on the CDM will not be straightforward to implement in dark matter only models. While it may be possible to roughly model the effect of baryons in simulations without efficient feedback by raising the concentration parameters of the dark matter haloes (e.g. Zentner, Rudd & Hu, 2008), feedback from AGN redistributes the gas on scales that exceed those of their host haloes.

The difference between dark matter only simulations and simulations that do include baryons is nearly the same for the WMAP3 and WMAP7 cosmologies, at least when AGN are included. This suggests that the relative effect of the baryons is roughly independent of cosmology, which will simplify future studies aiming to disentangle the two.

For our most realistic simulation, which assumes the WMAP7 cosmology and includes AGN feedback, the difference in power relative to the corresponding dark matter only simulation does not evolve much for z ≲ 1 on large scales (k < 10 h Mpc⁻¹). This is consistent with McCarthy et al. (2011), who showed that the expulsion of gas through AGN feedback occurs mostly at z ∼ 2–4, in the progenitors of today’s groups and clusters of galaxies.

We demonstrated that our conclusions are robust with respect to changes in the size of the simulation box and changes in the resolution (see Appendix A), with any additional modelling uncertainties only making it less likely that the matter power spectrum can be predicted with 1% accuracy any time soon. Looking at the large differences that still exist between the results of different authors, it is clear that much work remains to be done in understanding processes such as gas cooling and outflows.

In a follow-up paper (Semboloni et al., 2011), we study the implications of our findings for weak lensing surveys in more detail.
In this work we also demonstrate that the use of optical and X-ray observations of groups of galaxies can significantly reduce the uncertainties in the predictions of the matter power spectrum. While this provides a strong incentive for obtaining better and more observations of groups of galaxies, it is important to note that such auxiliary data will never completely remove the uncertainty inherent to cosmological probes of the matter distribution on scales that are potentially affected by baryonic physics. This is because one can never be sure that all the relevant effects are constrained by the secondary observations. For example, it may be that other models for the baryonic physics exist that also reproduce optical and X-ray observations of groups, but nevertheless predict different power spectra. It will therefore be crucial to consider a wide variety of observations, with optical and X-ray as well as Sunyaev-Zel’dovich observations holding particular promise, and a large range of models.

While the strong baryonic effects that we find imply that the cosmological constraints provided by upcoming weak lensing surveys will be model-dependent, it also means that such surveys will provide constraints on the physics of galaxy formation on scales that are difficult to obtain by other means.

Tabulated values of power spectra for redshifts z = 0–6 are available for all the simulations shown in this chapter at http://www.strw.leidenuniv.nl/VD11/ (see Appendix B).

Acknowledgements

We thank all members of the OWLS team for their contributions to the project. We are also grateful to Henk Hoekstra, Elisabetta Semboloni and Simon White for discussions. We thank the Horizon Project for the use of their code powmes. Furthermore, in our comparison with analytical models we have utilized code from the Cosmology Initiative iCosmo.
The simulations presented here were run on Stella, the LOFAR Blue Gene/L system in Groningen, on the Cosmology Machine at the Institute for Computational Cosmology in Durham as part of the Virgo Consortium research programme, and on Darwin in Cambridge. This work was sponsored by the National Computing Facilities Foundation (NCF) for the use of supercomputer facilities, with financial support from the Netherlands Organization for Scientific Research (NWO). This work was supported by an NWO VIDI grant and by the Marie Curie Initial Training Network CosmoComp (PITN-GA-2009-238356).

2.A Convergence tests

Here we investigate the effects of changing the box size or resolution of the reference simulation on its power spectrum.

2.A.1 Box size

In Figure 2.9 we vary the size of the box at constant resolution. The difference between the power spectrum of the 100 and the 50 h⁻¹ Mpc box is smaller than the difference between the latter and the 25 h⁻¹ Mpc box for all k, and the power spectrum of the largest box is nearly converged for k ≳ 20 h Mpc⁻¹. However, there are differences of up to a factor of a few at larger scales. For reference, we also show the input power spectrum linearly evolved to z = 0 and the HALOFIT model of the non-linear power spectrum by Smith et al. (2003a) (see §2.3.1).

The first wave mode corresponds to the size of the simulation box, which means that the power measured on this scale is meaningless; hence, we have omitted this point in all of our figures. The second and third wave modes closely follow the linear power spectrum. Note that the curves have very similar shapes on large scales, with the larger boxes shifted to larger scales. This is a consequence of employing the same seed for the random number generator used to create the initial conditions.

Perturbations that should go non-linear (λ ≲
10 h⁻¹ Mpc) are unable to collapse if their wavelength is close to the size of the box, which in turn suppresses the power on smaller scales. One might therefore worry that even the 100 h⁻¹ Mpc box is not large enough to obtain accurate power spectra for k ≳ 1 h Mpc⁻¹. Lacking larger simulations to check this, we compare to the HALOFIT model for the non-linear power spectrum, which shows where the transition from the linear power spectrum should take place at redshift zero. The power spectrum for the 100 h⁻¹ Mpc box follows this model very well on large scales, suggesting that a simulation of this size is very close to converged.

Note that finite volume effects only prevent us from obtaining highly accurate absolute power spectra, and only for the largest scales, while our results are based on the relative comparisons between models that used identical initial conditions. Since the 100 h⁻¹ Mpc box extends up to the largest non-linear scales, and since all simulations start from the exact same realisation of the linear power spectrum at z = 127, we do not expect our results to be affected by the finite volume of the simulations.

2.A.2 Numerical resolution

In Figure 2.10 we investigate the effects of changing the resolution for the reference simulation by varying the number of particles while keeping the box size fixed. The power spectrum of REF−L100N128 is quite noisy for k ≳ 100 h Mpc⁻¹ because of its much higher Poisson noise level. Testing for convergence on these scales is only possible thanks to the accurate shot noise subtraction. Surprisingly, the relative difference in power between REF−L100N128 and REF−L100N512 is smaller than the difference between the latter and REF−L100N256. When increasing the resolution beyond that of REF−L100N256, the power begins to decrease. To examine if this trend continues, we compare the power spectrum of REF−L050N256, which has the same resolution as REF−L100N512, to that of REF−L050N512 in the panel on the right.
We see that the power on the smallest scales (k ≳ 10 h Mpc⁻¹) converges only slowly, but that the trend of decreasing power with increasing resolution continues. This may indicate that, as lower mass haloes become resolved, the overall effects of supernova feedback become stronger.

We can verify this by isolating the effects due to baryon physics from those due to a more straightforward dependence on resolution. To this end, we examine what the effect is of increasing the particle number of the DMONLY simulation with a 100 h⁻¹ Mpc box in Figure 2.11. The behaviour here is quite different: as N grows, more low-mass haloes are resolved and the power on small scales increases. As we observe a reversed trend in Figure 2.10, we conclude that the increased baryonic effects that accompany a higher particle number are more important for the power spectrum than the straightforward dependence on resolution.

The difference between REF−L050N256 and REF−L050N512 is ∼ 0.1% at k = 1 h Mpc⁻¹ and ∼ 2% at k = 10 h Mpc⁻¹. We conclude that simulation REF−L100N512 is sufficiently converged for the scales of interest for this study, k ≲ 10 h Mpc⁻¹. Note that, since we are only interested in the relative differences between simulations with equal resolution, the uncertainty will in practice be much smaller.

Figure 2.9: Test of convergence of the z = 0 matter power spectrum in the reference model with respect to the size of the simulated volume, where the box size and particle number are varied in such a way as to keep the resolution constant. Also shown are the linear input power spectrum and the analytical non-linear power spectrum by Smith et al. (2003a). The red, dotted line in the top panel shows the (subtracted) theoretical shot noise level. The bottom panel shows the ratio of REF−L100N512 with respect to the other simulations.
With increased resolution we expect feedback processes to become more effective, meaning that we may have underpredicted the differences between models with different feedback processes in low-mass haloes on small scales.

Similar tests were performed by Colombi et al. (2009) for the convergence of powmes, which keeps the statistical error bounded through its use of foldings. The size of this error depends on the quantity C(k), which is defined as the number of independent wave modes at a given wave number k; to be more precise, we approximately have $\Delta P/P \propto C(k)^{-1/2}$ (Colombi et al., 2009). For our fiducial grid with 256³ grid cells, one can expect the statistical error to remain below |ΔP|/P ≈ 1.2% as long as errors due to shot noise do not dominate. We have checked that this is indeed the case. Note that this means that we can confidently measure 1% differences between simulations using our fiducial values, as we are interested in systematic offsets covering at least a small range of scales in k-space, rather than random deviations.

Figure 2.10: As in Figure 2.9, but now the numerical resolution is varied while keeping the box size constant. The left and right panels show power spectra for box sizes of 100 and 50 h⁻¹ Mpc, respectively. On small scales, k > 10 h Mpc⁻¹, convergence is slow.

Figure 2.11: Same as the left-hand panel of Figure 2.10, but now for DMONLY instead of REF. Here the behaviour is as expected: as the number of particles goes up, more low-mass haloes form and the power on small scales increases. A comparison with Figure 2.10 shows that increasing the resolution leads to stronger baryonic effects which may reverse the sign of the trend with resolution.

2.B Tabulated power spectra

Table 2.2 shows the power spectrum values for our most recent and most realistic simulation to date, AGN−WMAP7−L100N512, for a subset of scales at z = 0.
Our fiducial powmes values of 256³ grid points and 7 foldings were used, and shot noise has been subtracted. The full table, with power spectrum values at all scales shown in this chapter and redshifts up to z = 6, as well as tabulated data for all other simulations presented in this chapter, are available at http://www.strw.leidenuniv.nl/VD11/.

Table 2.2: Power spectrum values for AGN−WMAP7−L100N512 for a subset of scales at z = 0 (full table available online).

z        k [h/Mpc]     P(k) [h⁻³ Mpc³]   Δ²(k)
0.000    0.12566371    4364.4776         0.43876514
0.000    0.18849556    1853.4484         0.62886024
0.000    0.25132741    1524.3814         1.2259802
0.000    0.31415927    1112.5603         1.7476056
0.000    0.37699112    847.62970         2.3007519

3 The impact of baryonic processes on the two-point correlation functions of galaxies, subhaloes and matter

The observed clustering of galaxies and the cross-correlation of galaxies and mass provide important constraints on both cosmology and models of galaxy formation. Even though the dissipation and feedback processes associated with galaxy formation are thought to affect the distribution of matter, essentially all models used to predict clustering data are based on collisionless simulations. Here, we use large hydrodynamical simulations to investigate how galaxy formation affects the autocorrelation functions of galaxies and subhaloes, as well as their cross-correlation with matter. We show that the changes due to the inclusion of baryons are not limited to small scales and are even present in samples selected by subhalo mass. Samples selected by subhalo mass cluster ∼ 10% more strongly in a baryonic run on scales r ≳ 1 h⁻¹ Mpc, and this difference increases for smaller separations. While the inclusion of baryons boosts the clustering at fixed subhalo mass on all scales, the sign of the effect on the cross-correlation of subhaloes with matter can vary with radius.
We show that the large-scale effects are due to the change in subhalo mass caused by the strong feedback associated with galaxy formation and may therefore not affect samples selected by number density. However, on scales r ≲ r_vir significant differences remain after accounting for the change in subhalo mass. We conclude that predictions for galaxy-galaxy and galaxy-mass clustering from models based on collisionless simulations will have errors greater than 10% on sub-Mpc scales, unless the simulation results are modified to correctly account for the effects of baryons on the distributions of mass and satellites.

Marcel P. van Daalen, Joop Schaye, Ian G. McCarthy, C. M. Booth and Claudio Dalla Vecchia
Monthly Notices of the Royal Astronomical Society, Volume 440, Issue 4, pp. 2997-3010 (2014)

Baryons and the two-point correlation function

3.1 Introduction

Many cosmological probes are used in order to derive the values of the parameters describing our Universe, often relying on some aspect of large-scale structure. By combining different probes, degeneracies can be broken and the constraints on the numbers that characterise our Universe can be improved. However, observations alone are not enough: strong theoretical backing is needed to interpret the data and to avoid, or at least to reduce, unexpected biases.

Modelling our Universe as a dark matter only ΛCDM universe was a reasonable approximation for the interpretation of past data sets. However, over the last few years it has become clear that for many probes this is no longer the case in the era of precision cosmology: ignoring processes associated with baryons and galaxy formation may lead to serious biases when interpreting data. The existence of baryons and the many physical processes associated with them have been shown to significantly impact, for example, the mass profiles (e.g.
Gnedin et al., 2004; Duffy et al., 2010; Abadi et al., 2010; Governato et al., 2012; Martizzi et al., 2012; Velliscig et al., 2014) and shapes of haloes (e.g. Kazantzidis et al., 2004; Tissera et al., 2010; Bryan et al., 2013), the clustering of matter (e.g. White, 2004; Zhan & Knox, 2004; Jing et al., 2006; Rudd, Zentner & Kravtsov, 2008; Guillet, Teyssier & Colombi, 2010b; Casarini et al., 2011b; van Daalen et al., 2011) and, subsequently, weak lensing measurements (e.g. Semboloni et al., 2011; Semboloni, Hoekstra & Schaye, 2013; Yang et al., 2013; Zentner et al., 2013), the strong lensing properties of clusters (e.g. Mead et al., 2010; Killedar et al., 2012), and the halo mass function (e.g. Stanek, Rudd & Evrard, 2009; Cui et al., 2012; Sawala et al., 2013; Balaguera-Antolínez & Porciani, 2013; Martizzi et al., 2014; Velliscig et al., 2014). To complicate matters further, different authors studying the same aspects of galaxy formation often find different and sometimes even contradictory results, depending not only on which physical processes are modelled but also on the choice of numerical code, and particularly on the implementation of subgrid recipes for feedback from star formation and Active Galactic Nuclei (hereafter AGN) (e.g. Scannapieco et al., 2012). Until a consensus can be reached, it is therefore important to determine the range of values that observables can take depending on whether certain baryonic processes are included in a model, and the way in which they are implemented.

In this chapter, we aim to quantify the effects of baryons and galaxy formation on the two-point real-space correlation function. Specifically, we will investigate how the redshift zero galaxy and subhalo correlation functions and the galaxy-matter cross-correlation, which is observable through galaxy-galaxy lensing, are changed if baryonic processes are allowed to influence the distribution of matter to varying degrees, i.e. using different feedback models.
To this end, we will use the reference and AGN models from the OverWhelmingly Large Simulations project (OWLS, Schaye et al., 2010). These were also employed by van Daalen et al. (2011, see Chapter 2) and we have since repeated them using larger volumes, more particles and a more up-to-date cosmology. The AGN model is particularly relevant, as it has been shown to reproduce many X-ray and optical observations of groups and clusters (McCarthy et al., 2010, 2011; Stott et al., 2012).

Any changes in the clustering of objects brought about by galaxy formation can enter into the correlation function in two ways. The first and most well-established effect is due to a change in the mass of the objects. For example, assuming that higher-mass haloes are more strongly clustered, if supernova feedback systematically lowers the stellar content of haloes, then a model which includes this process is expected to show increased clustering at fixed stellar mass relative to one that does not.¹ Likewise, haloes at fixed halo mass are expected to show increased clustering when efficient feedback is included, due to the total mass of the halo being lowered. Secondly, the positions of galaxies and haloes may shift due to changes in the physics: if the mass within a certain radius around an object changes, then the gravitational force acting on those scales will change as well, affecting the dynamics of nearby galaxies and haloes. Moreover, tidal stripping, and hence also dynamical friction, will affect satellites differently if baryonic processes change the density profiles of either the satellites or the host haloes. We will consider both types of effects here; most importantly, we will disentangle the two and show what effects remain after we account for the change in halo mass, as could be done approximately by selecting samples with constant number density.
As we will see, not all shifts in position average out, nor can the modification of the halo profiles be ignored.

¹ Situations in which feedback would have the reverse effect are possible in principle. For example, if the stellar mass - halo mass relation were flat where AGN feedback is important and had a large scatter, then the stellar mass of some galaxies inhabiting such haloes could be lower than that of galaxies in lower-mass haloes. As a result, the most massive galaxies would reside in intermediate mass haloes. However, such a scenario is not supported by our simulations.

Quantifying the significance of the various ways in which clustering measurements may deviate from those in a dark matter only universe is vital for the improvement of current models employed in clustering studies. Typically these are based on the distribution of dark matter alone, be they semi-analytical models (see Baugh, 2006, for a review), a combination of halo occupation distribution (HOD) and halo models (e.g. Jing, Mo & Börner, 1998; Berlind & Weinberg, 2002; Cooray & Sheth, 2002; Yang, Mo & van den Bosch, 2003; Kravtsov et al., 2004; Tinker et al., 2005; Wechsler et al., 2006; van den Bosch et al., 2013) or subhalo abundance matching (SHAM) models (e.g. Vale & Ostriker, 2004; Shankar et al., 2006; Conroy, Wechsler & Kravtsov, 2006; Moster et al., 2010; Guo et al., 2010; Behroozi, Conroy & Wechsler, 2010; Simha & Cole, 2013). It is therefore important to investigate which ingredients may currently be missing from such efforts.

The effects of galaxy formation on subhalo-subhalo clustering were previously considered by Weinberg et al. (2008) and Simha et al. (2012). Weinberg et al. (2008) compared the clustering of objects at fixed number density in a dark matter only simulation with a baryonic simulation including weak supernova feedback but no feedback from AGN, and with identical initial conditions. They found that
subhaloes cluster more strongly on small scales in the baryonic simulation due to the increased survival rate of baryonic satellites during infall. While we find a similar increase in the autocorrelation of subhaloes on small scales (r ≲ r_vir) – with a corresponding decrease in clustering on slightly larger scales – we point out that such results may be biased, due to the difficulties of detecting infalling dark matter satellites (e.g. Muldrew, Pearce & Power, 2011; see our Appendix 3.B).

Simha et al. (2012) extended the work of Weinberg et al. (2008) in several ways, among which are the addition of more effective stellar feedback and the use of the mass of the subhalo at infall, rather than the current mass, when assigning galaxy properties to the subhaloes. They find that the addition of effective feedback causes the discrepancies between clustering in hydrodynamical simulations and results from subhalo abundance matching to increase. They demonstrate that the two-point correlation function of baryonic subhaloes can be recovered to better than 15% on scales r > 2 h⁻¹ Mpc when winds are included, but that the discrepancy at smaller scales in these simulations can be up to a factor of a few. The galaxy correlation function is reproduced much better if the stellar mass threshold is raised; however, as these simulations do not contain any form of feedback that is effective at high stellar masses, we would expect the further addition of a process like AGN feedback to exacerbate the discrepancy between subhalo abundance matching results and hydrodynamical simulations for massive galaxies.

This chapter is organized as follows. We will briefly introduce our simulations and explain how we calculate the relevant quantities in §3.2. Here we will also discuss how we identify the same halo in different simulations, an essential step in order to separate the change in halo mass from other effects.
We present our results in §3.3 and summarise our findings in §3.4. Finally, we show the convergence with resolution and box size in Appendix 3.A and consider the fraction of subhaloes successfully linked between simulations in Appendix 3.B.

3.2 Method

3.2.1 Simulations

We consider three models from the OWLS project (Schaye et al., 2010): DMONLY, REF and AGN. All of these simulations were run with a modified version of gadget iii, the smoothed-particle hydrodynamics (SPH) code last described in Springel (2005b). We will discuss the models employed briefly below.

In order to study relatively low-mass objects while also simulating a volume that is sufficiently large to obtain a statistical sample of high-mass objects, we combine the results of simulations with different box sizes. For each model, we ran simulations in periodic boxes of comoving side lengths L = 200 and 400 h⁻¹ Mpc, both with N³ = 1024³ CDM particles and – with the exception of DMONLY – an equal number of baryonic particles. The gravitational forces are softened on a comoving scale of 1/25 of the initial mean inter-particle spacing, L/N, but the softening length is limited to a maximum physical scale of 1 h⁻¹ kpc [L/(100 h⁻¹ Mpc)]. The particle masses in the baryonic L200 (L400) simulations are 4.68 × 10⁸ h⁻¹ M⊙ (3.75 × 10⁹ h⁻¹ M⊙) for dark matter and 9.41 × 10⁷ h⁻¹ M⊙ (7.53 × 10⁸ h⁻¹ M⊙) for the baryons. We will use the higher-resolution L200 simulations to study the clustering of galaxies with stellar mass M_* < 10¹¹ h⁻¹ M⊙ and subhaloes with total mass M_sh < 10¹³ h⁻¹ M⊙, while taking advantage of the larger volume of the L400 simulations to study higher masses. When considering cross-correlations with the matter distribution, resolution is more important than volume, and we use the L200 simulations at all masses. We discuss our choice of mass limits in Appendix 3.A, where we also show resolution tests.
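The particle masses and softening lengths quoted above follow from the box size, the particle number and the cosmological parameters adopted for all runs in this chapter (Ω_m = 0.272, Ω_b = 0.0455). As a rough consistency check, the Python sketch below recomputes them. The function names and the rounded critical density, 2.775 × 10¹¹ h² M⊙ Mpc⁻³, are illustrative assumptions and are not taken from the simulation code.

```python
# Rounded critical density at z = 0 in h^2 Msun Mpc^-3 (assumed standard value).
RHO_CRIT = 2.775e11


def particle_masses(box_size, n_per_dim, omega_m=0.272, omega_b=0.0455):
    """Return (m_dm, m_baryon) in h^-1 Msun.

    box_size is the comoving side length in h^-1 Mpc and n_per_dim is N,
    so that each species is sampled with N^3 particles.
    """
    # Mass of a particle that would carry the full critical density.
    mass_unit = RHO_CRIT * box_size**3 / n_per_dim**3
    m_dm = (omega_m - omega_b) * mass_unit
    m_baryon = omega_b * mass_unit
    return m_dm, m_baryon


def softening_lengths(box_size, n_per_dim):
    """Return (comoving softening, maximum physical softening) in h^-1 kpc."""
    comoving = 1000.0 * box_size / n_per_dim / 25.0  # 1/25 of L/N
    max_physical = 1.0 * box_size / 100.0            # 1 h^-1 kpc [L / 100 h^-1 Mpc]
    return comoving, max_physical


for box in (200.0, 400.0):
    m_dm, m_b = particle_masses(box, 1024)
    print(f"L{int(box)}: m_dm = {m_dm:.3g}, m_b = {m_b:.3g} h^-1 Msun")
```

With these inputs the sketch reproduces the quoted particle masses to within rounding, which also confirms that the dark matter particle mass scales with Ω_m − Ω_b rather than Ω_m in the baryonic runs.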
All the simulations we employ in this chapter were run with a set of cosmological parameters derived from the Wilkinson Microwave Anisotropy Probe (WMAP) 7-year results (Komatsu et al., 2011), given by {Ω_m, Ω_b, Ω_Λ, σ_8, n_s, h} = {0.272, 0.0455, 0.728, 0.81, 0.967, 0.704}. It is important to note that all simulations with identical box sizes were run with identical initial conditions, which allows us to compare the effects of baryons and galaxy formation for the exact same objects.

The DMONLY simulation, as its name suggests, contains only dark matter. This provides us with a useful baseline model for testing the impact of baryon physics.

The REF simulation is the reference OWLS model. It includes sub-grid recipes for star formation (Schaye & Dalla Vecchia, 2008), radiative (metal-line) cooling and heating (Wiersma, Schaye & Smith, 2009), stellar evolution, mass loss from massive stars and chemical enrichment (Wiersma et al., 2009) and a kinetic prescription for supernova feedback (Dalla Vecchia & Schaye, 2008). The reference simulation is not intended to be the most realistic, but instead includes only those physical processes most typically found in simulations of galaxy formation.

The third and final simulation we consider here, AGN, adds feedback from accreting supermassive black holes to the reference simulation. AGN feedback was modelled following the prescription of Booth & Schaye (2009), which built on the model of Springel, Di Matteo & Hernquist (2005). We believe AGN to be our most realistic model, as it is the only model that solves the well-known overcooling problem (e.g. Balogh et al., 2001) and that reproduces the observed properties of groups (McCarthy et al., 2010, 2011; Stott et al., 2012).
Specifically, this model has been shown to reproduce the gas density, temperature, entropy, and metallicity profiles inferred from X-ray observations, as well as the stellar masses, star formation rates, and stellar age distributions inferred from optical observations of low-redshift groups of galaxies. van Daalen et al. (2011) used this model to show that AGN feedback has a dramatic effect on the clustering of matter; here we wish to investigate whether the effect on the clustering of galaxies and subhaloes is equally important.

3.2.2 Calculating correlation functions

The correlation function, ξ(r), returns the excess probability, relative to a random distribution, of finding two objects at a given separation r. It is therefore a measure of the clustering of these objects as a function of scale. As our simulations contain only a moderate number of resolved objects (i.e. galaxies and (sub)haloes), we do not need to resort to approximations that are common in the calculation of two-point clustering statistics. Instead, we can use a parallelised brute force approach in which we obtain the (cross-)correlation function through simple pair counts, using the relation:

$$\xi_{XY}(r) = \frac{DD_{XY}(r)}{RR_{XY}(r)} - 1. \tag{3.1}$$

Here X and Y denote two (not necessarily distinct) sets of objects (e.g. galaxies and particles or galaxies and galaxies), DD_XY(r) is the number of unique pairs consisting of an object from set X and an object from set Y separated by a distance r, and RR_XY(r) is the expected number of pairs at this separation if the positions of the objects in these sets were random. As our simulations are carried out with periodic boundary conditions, more complicated expressions involving cross terms of the form DR_XY(r) (e.g. Landy & Szalay, 1993) are not necessary, nor do we need to actually create random fields; instead, we can simply compute the term in the denominator analytically.
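To make the estimator of equation (3.1) concrete, the Python sketch below counts pairs by brute force in a periodic box and evaluates the random-pair term analytically from the volumes of the radial shells. It is a minimal, unparallelised illustration and not the code used for the results in this chapter: the function name and its arguments are hypothetical, all objects carry equal weight (no mass weighting), and all separations are assumed to be smaller than half the box size.

```python
import numpy as np


def correlation_function(pos_x, pos_y, box_size, bin_edges, same_set=False):
    """Estimate xi_XY(r) = DD_XY(r) / RR_XY(r) - 1 in a periodic cubic box.

    pos_x, pos_y : arrays of shape (n, 3) with positions in [0, box_size).
    bin_edges    : increasing radial bin edges, all below box_size / 2.
    same_set     : set to True for an autocorrelation (pos_y is pos_x), so
                   that only unique pairs are counted.
    """
    pos_x = np.asarray(pos_x, dtype=float)
    pos_y = np.asarray(pos_y, dtype=float)
    bin_edges = np.asarray(bin_edges, dtype=float)

    dd = np.zeros(len(bin_edges) - 1)
    for i, point in enumerate(pos_x):
        # For an autocorrelation, pair object i only with objects j > i.
        others = pos_y[i + 1:] if same_set else pos_y
        delta = others - point
        # Minimum-image convention: wrap separations across the periodic box.
        delta -= box_size * np.round(delta / box_size)
        r = np.sqrt(np.sum(delta**2, axis=1))
        dd += np.histogram(r, bins=bin_edges)[0]

    # Expected pair counts for uniformly random positions, computed analytically.
    if same_set:
        n_pairs = len(pos_x) * (len(pos_x) - 1) / 2.0
    else:
        n_pairs = float(len(pos_x) * len(pos_y))
    shell_volumes = 4.0 / 3.0 * np.pi * (bin_edges[1:]**3 - bin_edges[:-1]**3)
    rr = n_pairs * shell_volumes / box_size**3

    return dd / rr - 1.0


# Usage: uniformly random points should give xi close to zero in every bin.
rng = np.random.default_rng(1)
points = rng.random((1500, 3))
print(correlation_function(points, points, 1.0, [0.1, 0.2, 0.3], same_set=True))
```

Because the box is periodic, every object sees the same shell volume regardless of its position, which is why no random catalogue or edge correction is needed in this sketch.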
The basic functions that we will consider in this chapter are the galaxy autocorrelation function, ξ_gg, the galaxy-mass cross-correlation function, ξ_gm, the subhalo autocorrelation function, ξ_ss, and the subhalo-mass cross-correlation function, ξ_sm. We divide galaxies and subhaloes into different bins according to their stellar and subhalo dark matter mass, respectively. When cross-correlating with matter, we weight particles by their mass. To keep the computation time manageable, we use only a randomly selected 25% of all particles for the lowest mass bin of the simulations with (2×)1024³ particles. In all other mass bins, we cross-correlate with the full particle distribution. We have verified that this does not influence our results in any way.

Throughout this chapter we will focus on the three-dimensional correlation function. We will only show the correlation functions in radial bins where the number of pairs exceeds 10, to prevent our results from being dominated by spurious clumping. We take the position of our objects to be the position of their most-bound particle, and assign each galaxy a mass equal to the total mass in stars in its subhalo. Finally, we confine our analysis to scales r ≲ 20 h⁻¹ Mpc, corresponding to at most 1/10th of the box size, in order to avoid the effects of missing large-scale modes.

3.2.3 Linking haloes between different simulations

As discussed previously, there are two main ways in which the two-point correlation function may be affected by baryonic processes: through changes in the masses of objects, and through shifts in their positions. To disentangle the two effects, we make use of the fact that all OWLS models were run from identical initial conditions, allowing us to identify the same objects in different simulations.
In this way we can assign each object in simulation B the mass that the same object possesses in simulation A, thereby isolating the effect of changes in the positions of objects on the clustering signal.

Haloes are identified in our simulations using the Friends-of-Friends algorithm (run on the dark matter particles, with linking length 0.2) combined with a spherical overdensity finder, as implemented in the subfind algorithm (Springel et al., 2001; Dolag et al., 2009). For every (sub)halo in simulation A we flag the N_mb most-bound dark matter particles, meaning the particles with the highest absolute binding energy. Next, we locate these particles in the other simulations, using the unique number associated with every particle. If we find a (sub)halo in simulation B that contains at least 50% of these flagged particles, a first link is made. The link is confirmed if, by repeating the process starting from simulation B, the previous (sub)halo in simulation A is found. Here we use N_mb = 50, but we have verified that our results are insensitive to this choice (see Velliscig et al., 2014). For haloes with fewer than N_mb dark matter particles, all dark matter particles are used.

The fraction of haloes linked quickly increases as a function of mass, reaching essentially unity for sufficiently well-resolved haloes. For all subhaloes employed in this work, the linked fraction of DMONLY subhaloes typically exceeds 99%, the exception being the lowest mass bin where the linked fraction is around 98%. However, at small separations the linked fraction can be much smaller. This is explored in more detail in Appendix 3.B.

3.3 Results

In this section we will explore the effects of baryon physics on the two-point correlation function at redshift zero. We will first consider the galaxy-galaxy and galaxy-matter correlation functions as these are the most directly observable.
Since stellar masses are strongly model-dependent, we will switch from galaxies to subhaloes in §3.3.2, which allows us to examine how clustering statistics derived from dark matter only simulations will differ from those including baryons. Finally, in §3.3.3, we will take the change in the mass of subhaloes out of the equation, and consider the change in the correlation function for the exact same objects as a function of the model used.

3.3.1 Clustering of galaxies

3.3.1.1 Autocorrelation

Figure 3.1: The galaxy autocorrelation function for the REF and AGN simulations (top), as well as the fractional difference between the two (bottom). Different colours correspond to different stellar masses, as indicated in the legend. The legend also shows the number of galaxies in each bin for each simulation (REF, AGN). At any mass, galaxies in AGN are more highly clustered than those in REF on large scales, an effect that increases sharply above 10¹² h⁻¹ M⊙, where AGN feedback is most important. Note that these effects may be underestimated for the two highest mass bins for reasons discussed in §3.3.1.3. The relative decrease in clustering for the AGN simulation on small scales is mostly a numerical effect (see text).

In Figure 3.1 we plot the galaxy autocorrelation functions, ξ_gg(r), for models REF and AGN in three different bins of stellar mass, as indicated in the legend. The bottom panel shows the relative difference in the clustering strength of galaxies in these models. Since the clustering of haloes increases with mass, and since AGN feedback reduces the stellar content of massive haloes, one would expect galaxies in the AGN simulation to be more strongly clustered at fixed (high) stellar mass. As higher-mass galaxies are expected to host more powerful AGN, this effect is expected to increase with mass.
This is indeed what we observe in Figure 3.1: as long as we consider sufficiently large scales, galaxies in the AGN simulation show increased clustering relative to those in REF, and the relative difference between clustering strengths in the two simulations tends to increase with mass. For galaxies with stellar masses M_* < 10¹⁰ h⁻¹ M⊙ we expect the effect to be minor, since in such low-mass objects feedback is controlled by stellar rather than AGN feedback in these models (e.g. Haas et al., 2013).

Also indicated in the legend are the number of galaxies in each mass bin for each simulation, the first number corresponding to REF and the second to AGN. Because AGN feedback systematically lowers the stellar content of massive haloes, and since the number density of haloes decreases with mass, the AGN simulation suffers from somewhat worse statistics at high stellar masses than the REF simulation. However, this effect is only seen in the highest mass bin, M_* > 10¹² h⁻¹ M⊙, and even in this mass range we can still draw robust conclusions for scales r > 2 h⁻¹ Mpc.

Note that any two subhaloes must have a finite minimum distance between them in order to, on the one hand, be recognised as separate objects, and on the other, not be tidally destroyed. As we identify galaxies by the subhaloes they occupy, this causes a slight turnover in the galaxy correlation functions on small scales. Since this minimum distance increases with the size and therefore mass of the subhaloes hosting the galaxies, at fixed stellar mass this turnover is seen at larger scales in the model AGN than in REF. This in turn causes the galaxies in AGN to appear less clustered on small scales.

3.3.1.2 Cross-correlation with matter

Figure 3.2 shows the galaxy-matter cross-correlation functions for these simulations, which are relevant for galaxy-galaxy lensing.
Due to the high number of particles relative to the number of galaxies, the statistics are significantly improved relative to Figure 3.1, and we can see clearly that including AGN feedback greatly increases the clustering of matter and galaxies at fixed stellar mass.² The relative increase of clustering with mass is more strongly scale-dependent than for the galaxy-galaxy case. The relative difference in clustering strength between AGN and REF is largest around 1 h⁻¹ Mpc for the most massive galaxies, where galaxies at fixed stellar mass are nearly twice as strongly clustered with matter when AGN are included. At larger scales, AGN always shows ∼ 50% stronger clustering than REF for M_* > 10¹² h⁻¹ M⊙. Even for galaxies in the stellar mass range 10¹¹ < M_*/[M⊙/h] < 10¹² we see an increase in clustering of up to 150% around 70 h⁻¹ kpc, and an offset of ∼ 20% at all larger scales.

Interestingly, the relative difference in the galaxy-matter cross-correlation functions between AGN and REF increases towards smaller scales before suddenly dropping, causing galaxies to become less strongly clustered with the matter distribution in the AGN simulation on the very smallest scales probed here. This behaviour is caused by two competing effects, a point we will return to when discussing the subhalo-matter cross-correlation function in the next sections. On the one hand, the lowering of the stellar mass by AGN feedback tends to increase clustering at fixed stellar mass, and more so towards smaller scales, as galaxies of the same stellar mass now inhabit denser environments. On the other hand, as shown in e.g. Velliscig et al. (2014), a large amount of gas – and even dark matter – is removed from the galaxy, and sometimes from the halo entirely, decreasing the density peaks in the matter distribution (see e.g. van Daalen et al., 2011). As we can see from Figure 3.2, the latter effect dominates on sub-galaxy scales (r ≲ 10 h⁻¹ kpc).
² Note that the number of objects in the two most massive bins, shown in the legend, is lower than for the autocorrelation function. This is because we now use the higher-resolution L200 for all mass bins, whereas we previously used L400 for the two highest mass bins to obtain better statistics (see §3.2.1).

Figure 3.2: As in Figure 3.1, but now showing the galaxy-matter cross-correlation function for the REF and AGN simulations. Except for sub-galactic scales, AGN feedback tends to increase the clustering of galaxies with matter at fixed stellar mass. Both the overall magnitude of the effect and the length scales over which it occurs increase with stellar mass, and for M_* > 10¹² h⁻¹ M⊙ the increase in clustering with the matter distribution reaches values as high as 180%.

3.3.1.3 Caveats

We note that the effect of AGN feedback may be underestimated for massive galaxies due to two effects. The first only applies to the two highest mass bins and only to results based on the L400 runs (i.e. the autocorrelation functions): the implementation of AGN feedback in these simulations is somewhat resolution dependent, and as a consequence its effect is weaker in the 400 h⁻¹ Mpc box than in the 200 h⁻¹ Mpc simulation. This is because the seed black holes can only be injected into resolved haloes, which corresponds to a minimum halo mass that is 8 times higher in the L400 simulation than in the L200 simulation (i.e. the difference in mass resolution). The result is that AGN feedback in the 400 h⁻¹ Mpc box, used in the two highest mass bins in Figures 3.1 and 3.2, may be too weak for galaxies occupying haloes with masses M ≲ 10¹³ h⁻¹ M⊙.
In fact, while the effect of resolution is small for galaxies with masses M_* > 10¹² h⁻¹ M⊙, for 10¹¹ < M_*/[M⊙/h] < 10¹² the effect is significant: when using the higher-resolution L200 simulation in this mass bin, we find an increase in galaxy-galaxy clustering relative to REF of ∼ 50% for r ≳ 2 h⁻¹ Mpc.

The second effect is due to the way stellar mass is estimated in observations, where the use of an aperture excludes intracluster light. For the more massive galaxies in our sample, which host the most powerful AGN, this aperture size is typically significantly smaller than the size of the region containing the stars. However, each simulated galaxy is assigned a stellar mass equal to the total mass in stars in its subhalo. The stellar mass of our most massive galaxies is therefore significantly higher than would be estimated observationally. Hence, the strong effects of AGN feedback that we find will be relevant for lower observed stellar masses than suggested by our plots.

Regardless, even without taking these effects into account, it is clear that AGN feedback plays an important role in the clustering of galaxies and matter, and should not be ignored in theoretical models that aim to predict ξ_gm(r) to ∼ 10% accuracy or better, even when only considering relatively low stellar masses (M_* = 10¹⁰ − 10¹¹ h⁻¹ M⊙).

At this point it is important to note that although our model AGN reproduces the stellar masses of group-sized haloes relatively well (McCarthy et al., 2010, 2011), predicted stellar masses are generally strongly model-dependent, as well as cosmology-dependent. Abundance matching studies, on the other hand, reproduce the stellar mass-halo mass relation by construction (e.g. Moster et al., 2010).
Since clustering models typically employ the results from such studies, which in turn rely on dark matter only simulations, it is useful to consider the clustering of the subhaloes that host the galaxies and to select objects by their total subhalo mass, instead of by their stellar mass. This also allows us to consider the effect of galaxy formation relative to a dark matter only scenario. For the remainder of this chapter, we will therefore focus on the clustering of subhaloes.

3.3.2 Clustering of subhaloes

3.3.2.1 Autocorrelation

The top panel of Figure 3.3 shows the subhalo autocorrelation function, $\xi_{\rm ss}(r)$, for three different simulations: DMONLY, REF and AGN. Different colours indicate different subsamples, selected by the total mass of the subhaloes, $M_{\rm sh,tot}$, though we note that the results would have been very similar had we selected by dark matter mass. The correlation functions are displayed in the top panel, while in the middle and bottom panels the baryonic simulations are compared to DMONLY. From the top panel we can already see that subhalo clustering in the dark matter only simulation behaves quite differently from that in the baryonic models, especially on small scales ($r \lesssim 1\,h^{-1}\,\mathrm{Mpc}$). Vertical dotted lines indicate the median virial radii³ of subhaloes in each mass bin, which are similar to the scale at which the subhalo correlation functions for DMONLY turn over.

3 We computed a characteristic size, $r_{\rm vir}$, for each subhalo by taking its total mass, $M_{\rm sh,tot}$, and treating it as the mass within a region with a mean overdensity of $\Delta = 200$ relative to $\rho_{\rm crit}$ (i.e. $r_{\rm vir} \approx r_{200c}$). For reference, for a typical dark matter halo $r_{500c} \sim 0.65 - 0.75\,r_{200c}$, where $r_{500c}$ corresponds to the radius out to which the dominant baryonic component (hot gas) of groups and clusters is typically measured (e.g. Vikhlinin et al., 2006).
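The characteristic size defined in footnote 3 can be written as $r_{\rm vir} = [3 M_{\rm sh,tot} / (4\pi \Delta \rho_{\rm crit})]^{1/3}$. The short Python sketch below evaluates this expression. It assumes the present-day critical density, $\rho_{\rm crit} = 2.775 \times 10^{11}\,h^2\,\mathrm{M}_\odot\,\mathrm{Mpc}^{-3}$, and the function name is illustrative; the redshift at which the radii in this chapter were evaluated is not restated here, so the numbers are indicative only.

```python
import math

RHO_CRIT = 2.775e11  # critical density at z = 0 in h^2 Msun / Mpc^3 (assumed epoch)


def virial_radius(m_sh_tot, delta=200.0, rho_crit=RHO_CRIT):
    """Radius (h^-1 Mpc) of a sphere with mean density delta * rho_crit
    that encloses the mass m_sh_tot (h^-1 Msun)."""
    return (3.0 * m_sh_tot / (4.0 * math.pi * delta * rho_crit)) ** (1.0 / 3.0)


# Radii at the edges of the subhalo mass bins used in this chapter.
for log_m in (11, 12, 13, 14):
    r = virial_radius(10.0**log_m)
    print(f"M = 1e{log_m} Msun/h -> r_vir = {r:.3f} Mpc/h")
```

Under these assumptions a subhalo of $10^{12}\,h^{-1}\,\mathrm{M}_\odot$ has $r_{\rm vir}$ of roughly $0.16\,h^{-1}\,\mathrm{Mpc}$, and the radius grows as $M_{\rm sh,tot}^{1/3}$.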
Figure 3.3: The subhalo autocorrelation function, $\xi_{\rm ss}(r)$, for DMONLY (solid), REF (dashed) and AGN (dot-dashed lines), and the fractional differences between them. Different colours are used for different total subhalo masses, and the number of objects in each bin is indicated in the legend (DMONLY, REF, AGN). Top: The correlation functions for the three simulations. Vertical dotted lines indicate the median $r_{\rm vir}$ of the subhaloes. Middle: The fractional difference of subhalo clustering in REF relative to DMONLY. The curves are greyed out for radii where they may be biased due to subhalo non-detections (see Appendix 3.B). Bottom: The fractional difference of subhalo clustering in AGN relative to DMONLY. Both baryonic simulations show increased clustering, and this effect is stronger on smaller scales. Note that the range on the y-axis is much smaller here than in Figure 3.1.

At the high-mass end, all three simulations show very similar behaviour. Looking at the middle and bottom panels, where we compare the autocorrelation of subhaloes in REF and AGN respectively to that in DMONLY, we see that all subhaloes in the baryonic simulations are typically ∼ 10% more strongly clustered on large scales than their dark matter only counterparts. As we will demonstrate in §3.3.3, this difference is due to the reduction of subhalo mass caused by baryonic processes. For the larger subhaloes, $10^{13} < M_{\rm sh,tot}/[\mathrm{M}_\odot/h] < 10^{14}$, this offset is somewhat larger when AGN feedback is included, because supernova feedback alone cannot change the subhalo mass by as much as it can for lower halo masses (e.g. Sawala et al., 2013; Velliscig et al., 2014).
The offset in clustering strength relative to DMONLY of the lowest-mass subhaloes is also slightly increased by the addition of AGN: while the masses of these subhaloes may seem to be somewhat low to be significantly affected by AGN feedback, we should keep in mind that satellite subhaloes may have lost part of their mass through tidal stripping. Moreover, these would correspond to subhaloes of a higher mass in a DMONLY simulation, as a significant fraction of the mass has been expelled. Additionally, low-mass subhaloes do not need to host AGN themselves to be affected by them: satellites in groups and clusters are sensitive to changes in the host halo profile and possibly increased stripping caused by the powerful AGN in the more massive galaxies in their environment.

The differences between the baryonic and dark matter only simulations increase rapidly for $r < 2r_{\rm vir}$, at least for $M_{\rm sh,tot} < 10^{14}\,h^{-1}\,\mathrm{M}_\odot$. As we can see most easily in the top panel, subhaloes in the REF simulation are significantly more clustered on small scales than those in the AGN simulation, which seems to contradict the results of the previous section. This is because subhaloes in the REF simulation are more compact at fixed mass than those in the AGN simulation, due to the additional form of feedback in the latter which removes more material from the centre and lowers the concentration in the inner parts of the subhaloes. However, the haloes in the AGN simulation are still more compact than those in DMONLY (see e.g. Velliscig et al., 2014).

The increased concentration of subhaloes in baryonic simulations allows them to be identified as separate objects down to smaller scales, and also to withstand the effects of tidal stripping longer than their dark matter only counterparts. Both these effects tend to increase the clustering on small scales. This relative increase in the number density of subhaloes close to the centres of haloes in baryonic simulations was seen before by e.g. Macciò et al.
(2006), Libeskind et al. (2010), Romano-Díaz et al. (2010) and Schewtschenko & Macciò (2011) (although Romano-Díaz et al. 2010 note that without strong feedback, the effect may be reversed). On the other hand, baryonic subhaloes are generally less massive when they are centrals, and those that become satellites typically fall in later due to the smaller virial radius of the main halo compared to a pure dark matter run, which means that they should experience less dynamical friction on scales where tidal stripping is not yet important. This is indeed what Schewtschenko & Macciò (2011) find, although this effect cannot be seen for the mass-selected sample shown in Figure 3.3 due to the much larger effect of the change in mass. We explore the clustering behaviour of baryonic satellites in more detail in §3.3.3.1.

For now, we note that if our ability to detect baryonic subhaloes down to smaller radii than pure dark matter ones were the dominant cause of an increased number density of subhaloes at small separations in REF and AGN, this would introduce a bias towards observing a stronger clustering signal in baryonic models on scales $r \lesssim 2r_{\rm vir}$.⁴ We discuss this possible source of error in Appendix 3.B, and based on the results reported there we have chosen to show the relative differences in clustering as grey dot-dot-dot-dashed curves in Figure 3.3 for subhalo masses and scales that may be significantly affected by this bias.

Comparing Figures 3.1 and 3.3, we see that the single act of adding AGN feedback affects the clustering of galaxies and subhaloes very differently. For galaxies, a strong increase in clustering is found for the highest-mass galaxies, and on large scales, since the same subhaloes host galaxies with a much lower stellar mass when AGN feedback is added. Low-mass galaxies are, however, not strongly affected by AGN feedback.
For subhaloes, on the other hand, we find that the largest effects are found on small scales, and especially at the lowest masses: we find a strong decrease in clustering for $r \lesssim r_{\rm vir}$ when adding AGN feedback to the reference model, regardless of halo mass, and far less change on large scales.

These two main differences have two different causes. The large-scale difference between the effect of AGN feedback on galaxies and on subhaloes arises because, while AGN are powerful enough to quench star formation and to remove a lot of gas from galaxies, thus lowering the stellar mass, they are not powerful enough to significantly change the halo mass. However, as is shown in detail by Velliscig et al. (2014), and as we will also see in the next section, they do have a significant effect on the density profiles of subhaloes, and through this on their distribution. At fixed mass, the subhaloes in REF are more compact and more massive than those in AGN, causing both the satellite survival rate and the dynamical friction experienced by satellites to increase, which in turn causes the small-scale differences in clustering we just discussed.

3.3.2.2 Cross-correlation with matter

We consider the subhalo-mass cross-correlation function in Figure 3.4. From the top and middle panels we observe, as was the case for galaxy-galaxy clustering, that on the smallest scales and at fixed total mass, subhaloes cluster far more strongly with matter in the baryonic simulations than in the dark matter only simulations. Additionally, there is a constant 5% offset in favour of baryonic simulations on the largest scales, for all halo masses. The baryonic bias increases as we move from large scales towards the virial radius, but, interestingly, the strength of the effect decreases below scales approximately corresponding to $r_{\rm vir}$ before picking up again at the smallest scales shown.
This decrease below $r_{\rm vir}$ even causes the lowest-mass DMONLY subhaloes to be more strongly clustered than their REF counterparts around $r = 20\,h^{-1}\,\mathrm{kpc}$. For AGN, this happens even for the highest-mass subhaloes, and over a larger range of scales.

4 We thank Raul Angulo for pointing out this potential problem.

Figure 3.4: As Figure 3.3, but now for the subhalo-mass cross-correlation function, $\xi_{\rm sm}(r)$. Subhaloes are generally more strongly clustered with matter in the baryonic simulations than in DMONLY. The largest differences are found for REF, for which $\xi_{\rm sm}(r)$ can be up to 40% higher on intermediate scales for the lowest-mass subhaloes, and much higher still for any subhalo mass if sufficiently small scales are considered. There is also a constant 5% difference in favour of the baryonic simulations on large scales, regardless of subhalo mass. While the AGN model seems to increase clustering at fixed subhalo mass less than REF, it does show a stronger decrease in clustering up to scales $r \sim 10^{2}\,h^{-1}\,\mathrm{kpc}$. Note that in both cases the clustering differences between the models are strongly non-monotonic, which is caused by the interplay between the change in the total subhalo mass and the change in the subhalo mass profiles.

As we will show in the next section, the strongly non-monotonic behaviour of the relative difference in $\xi_{\rm sm}$ between the baryonic simulations and DMONLY is caused by two counteracting effects. On the one hand, the lowered halo masses in the baryonic simulations tend to increase clustering at fixed mass on all scales. On the other hand, while the dissipation associated with galaxy formation causes the inner halo profile to steepen, increasing clustering on small scales, the associated feedback causes the outer layers of the halo to expand, decreasing clustering on intermediate scales. This effect is stronger when AGN feedback is included.
Note that we observe similar behaviour for the relative differences between the galaxy-matter cross-correlation functions for REF and AGN.

Furthermore, by comparing the bottom two panels, we can see that for low halo masses ($M_{\rm sh,tot} < 10^{12}\,h^{-1}\,\mathrm{M}_\odot$), for which AGN feedback is not very important, the small-scale clustering of haloes in REF and AGN is nearly identical, while subhaloes and matter cluster much more weakly on a range of scales around $r_{\rm vir}$ in AGN. On the other hand, for higher-mass haloes ($M_{\rm sh,tot} > 10^{12}\,h^{-1}\,\mathrm{M}_\odot$), significant differences can be seen from the smallest scales out to $r \sim 1\,h^{-1}\,\mathrm{Mpc}$. This again confirms the strong effect that AGN feedback has on the mass distribution: the higher the mass of the halo, the more important feedback from supermassive black holes is in removing material from the centre. This in turn flattens the mass profiles of the haloes and smooths out the density peaks, decreasing the small-scale lensing signal relative to REF.

As we have already pointed out several times, the most important cause of the increase in clustering due to galaxy formation with strong feedback is the lowering of the mass of objects. However, secondary effects, such as the resulting changes in the dynamics and density profiles of haloes, are also expected to be significant. To disentangle these types of effects, we will use our linking scheme to match subhaloes between different simulations, allowing us to see if any significant difference in the clustering remains once the change in mass has been accounted for.

3.3.3 Accounting for the change in mass

As we are mainly interested in how galaxy formation changes the clustering of objects with respect to a dark matter only scenario, we use the linking algorithm described in §3.2.3 to link subhaloes in REF and AGN to those in DMONLY, and assign all objects the mass of their DMONLY counterpart.
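The mass assignment step can be sketched in a few lines of Python. This is only an illustration of the bookkeeping, not the linking algorithm of §3.2.3 itself: it assumes the linking step has already produced, for every baryonic subhalo, the index of its DMONLY counterpart (or -1 when no counterpart was found). The array layout, function names and toy values are hypothetical.

```python
import numpy as np


def assign_dmonly_mass(dmo_mass, dmo_index):
    """Return the DMONLY mass of each linked baryonic subhalo, plus a mask
    marking which baryonic subhaloes were linked.

    dmo_mass  : masses of the DMONLY subhaloes (h^-1 Msun)
    dmo_index : for each baryonic subhalo, the index of its DMONLY
                counterpart, or -1 if no link was found (hypothetical
                output format of the linking step)
    """
    dmo_mass = np.asarray(dmo_mass, dtype=float)
    dmo_index = np.asarray(dmo_index)
    linked = dmo_index >= 0
    return dmo_mass[dmo_index[linked]], linked


def mass_bin(mass, edges=(1e11, 1e12, 1e13, 1e14)):
    """Index of the mass bin of each object; -1 marks objects that fall
    outside the range spanned by edges."""
    idx = np.digitize(mass, edges) - 1
    idx[(idx < 0) | (idx >= len(edges) - 1)] = -1
    return idx


# Toy example: three DMONLY subhaloes, four baryonic ones (one unlinked).
dmo_mass = [3e11, 5e12, 2e13]
dmo_index = [2, 0, -1, 1]
mass, linked = assign_dmonly_mass(dmo_mass, dmo_index)
bins = mass_bin(mass)
print(mass, linked, bins)
```

Selecting on `bins` rather than on the baryonic masses means that both runs contribute exactly the same objects to each mass bin, which is what removes the effect of the change in mass from the comparison.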
Note that this means that there are in fact two different DMONLY versions of each correlation function: one derived using all subhaloes for which a counterpart was found in REF, and one derived using all subhaloes for which a counterpart was found in AGN. In practice, however, the linked halo samples are nearly identical, and the resulting correlation functions for DMONLY are virtually indistinguishable. We therefore show only one of these in the top panels of Figures 3.5 and 3.6, although both are used to determine the differences with respect to REF and AGN.

Figure 3.5: As Figure 3.3, but now only showing the autocorrelation functions for subhaloes linked between a baryonic simulation and DMONLY, and selected based on their mass in the latter. Relative to Figure 3.3, this procedure removes the effects of changes in the subhalo masses. As the numbers in the legend imply, almost the exact same haloes are linked with dark matter only haloes in both cases. The bottom two panels immediately show that in all cases no differences $\gtrsim 5\%$ in $\xi_{\rm ss}$ remain on scales $r \gg r_{\rm vir}$, indicating that the differences we saw in Figure 3.3 on these scales were due to the masses of the objects changing. For smaller scales, and especially for low-mass subhaloes, the change in dynamics of the objects in the baryonic simulations can have significant effects, which can primarily be seen as a decrease in clustering on scales $r \lesssim 2r_{\rm vir}$. Shaded areas indicate the regions allowed by 1σ bootstrap errors, which show that the relative small-scale decrease of clustering of low-mass baryonic subhaloes is significant.

3.3.3.1 Autocorrelation of linked subhaloes

We first consider Figure 3.5, where we show the impact of galaxy formation on the clustering of subhaloes once the change in mass has been accounted for.
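The bootstrap errors shown as shaded areas in Figure 3.5 are computed on linked samples, so the same objects can be drawn for both simulations in every resampling. The Python sketch below illustrates that idea in a simplified form. It is not the procedure used for the figure: instead of recomputing the correlation function for each resampled catalogue, it assumes that a per-object pair count in a single radial bin is available for each run (a hypothetical input) and bootstraps the fractional difference of the summed counts.

```python
import numpy as np


def paired_bootstrap_error(counts_dmo, counts_bar, n_boot=500, seed=0):
    """1-sigma bootstrap error on sum(counts_bar) / sum(counts_dmo) - 1.

    counts_dmo, counts_bar : per-object pair counts in one radial bin for
        the same linked objects in the two runs (illustrative inputs).
    The same resampled indices are used for both runs, so scatter that is
    shared between the two samples cancels in the ratio.
    """
    counts_dmo = np.asarray(counts_dmo, dtype=float)
    counts_bar = np.asarray(counts_bar, dtype=float)
    rng = np.random.default_rng(seed)
    n = counts_dmo.size
    diffs = np.empty(n_boot)
    for i in range(n_boot):
        pick = rng.integers(0, n, size=n)  # draw objects with replacement
        diffs[i] = counts_bar[pick].sum() / counts_dmo[pick].sum() - 1.0
    return diffs.std(ddof=1)


# Toy data: the second run has 10% more pairs around every object.
toy_dmo = np.arange(1, 101, dtype=float)
toy_bar = 1.1 * toy_dmo
print(paired_bootstrap_error(toy_dmo, toy_bar))
```

In this toy case the error on the fractional difference vanishes up to rounding, because the two samples are perfectly correlated. Resampling the two runs independently would instead assign the full sample variance to the ratio.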
Comparing first the sample sizes (numbers in the legend) to those in Figure 3.3, we see that nearly all DMONLY subhaloes have a match in each of the baryonic simulations.⁵ Note that the first number in the legend now indicates the sample size of subhaloes linked between DMONLY and REF, while the second gives the number of subhaloes linked between DMONLY and AGN. We have now also performed 500 bootstrap resamplings for each pair of simulations, and show the 1σ errors derived from these as shaded areas in the figure. As we are now using the exact same (linked) sample of subhaloes for any pair of simulations, we are able to avoid overestimating the errors due to the false assumption that the halo samples of the simulations are independent. Similar errors are expected for Figure 3.3.

Comparing the bottom two panels of Figure 3.5 to those of Figure 3.3, we immediately see that essentially nothing of the ∼ 10% difference in the clustering amplitude on large scales remains, confirming that this was solely due to galaxy formation changing the masses of these subhaloes. By accounting for the change in the masses of objects due to the effects of baryon physics, one will therefore automatically obtain the correct autocorrelation function at all halo masses, on scales $r \gg r_{\rm vir}$.

However, on smaller scales the changes in the dynamics of subhaloes in the baryonic runs become important. This is especially the case for low-mass objects, which are often satellites. As we discussed in §3.3.2.1, Schewtschenko & Macciò (2011) have shown that, initially, satellites in dark matter only simulations move in closer to the centre of the main halo in the same amount of time, which is due in part to the decrease in the virial radius of the main halo when baryons are included (also found for baryonic haloes in our simulations, see Velliscig et al., 2014), and in part to the increased dynamical friction experienced by the more massive dark matter satellites.
However, as the satellites undergo tidal stripping, baryonic subhaloes are able to retain more of their mass due to their increased concentrations, which causes the situation to reverse on small scales, increasing the number density of baryonic subhaloes relative to pure dark matter ones. This was also found by e.g. Macciò et al. (2006), Libeskind et al. (2010) and Romano-Díaz et al. (2010). However, at the same time one expects to see an increase in the number density, and consequently the clustering, of baryonic satellite subhaloes at small scales due to the ability to trace baryonic subhaloes longer during infall. This resolution effect could lead to a bias at small separations.

5 As we now select subhaloes by the mass of their DMONLY counterpart, the number of subhaloes can only be directly compared to those of DMONLY in Figure 3.3, not to the number of baryonic subhaloes in Figure 3.3.

To account for this potential bias, we consider the fraction of subhaloes in DMONLY for which a link could be found in REF in Appendix 3.B. There we show that the fraction of linked subhaloes decreases strongly on small scales for low-mass subhaloes. Higher-resolution simulations are needed to investigate whether the increased survival rate of baryonic subhaloes, and the resulting increase in clustering seen in Figures 3.3 and 3.5 on scales $r \lesssim r_{\rm vir}$, is physical or not. We have therefore greyed out the curves in these figures on scales where this bias may play a significant role.

However, even after accounting for this potential bias, interesting differences in clustering remain on scales $r \lesssim 2r_{\rm vir}$, as Figure 3.5 shows. Especially in the AGN simulation, subhaloes tend to be ∼ 10% less clustered at $r \sim r_{\rm vir}$. A very small increase in clustering (∼ 1%) can be seen on slightly larger scales, $r \sim 3 - 4\,r_{\rm vir}$.
Both these differences could be explained by the combination of the greater dynamical friction initially experienced by dark matter only subhaloes, together with the delayed infall of baryonic subhaloes. We plan to investigate these effects further in a follow-up paper where we consider the differences in the satellite profiles due to galaxy formation.

Note that small changes in the simulation code (such as changing the level of optimisation when compiling the simulation code) can shift the positions of satellite galaxies and subhaloes by small amounts, even if we start from identical initial conditions.⁶ However, as almost all these shifts are random, they average out for two-point statistics. Shifts due to dynamical friction and similar effects acting on satellites are the exceptions, as these tend to systematically move satellite subhaloes closer to their respective centrals.

3.3.3.2 Cross-correlation with matter

Finally, we consider what remains of the baryonic effects on the subhalo-matter cross-correlation function after accounting for the change in the masses of subhaloes. Here, too, we show 1σ errors in all panels, now derived from 10000 bootstrap resamplings. In many cases, the errors are smaller than the widths of the lines.

Comparing the bottom panels of Figure 3.6 to those of Figure 3.4, we see that while the large-scale offset is now completely removed, we are left with a non-negligible effect on scales $r \lesssim 1\,h^{-1}\,\mathrm{Mpc}$ for all subhalo masses. This again shows the strong effect that feedback can have on the mass distribution: both supernova and AGN feedback move matter to large scales, decreasing $\xi_{\rm gm}(r)$. We see that, especially when AGN feedback is included, this can significantly affect clustering out to several times the virial radius, which matches the findings of van Daalen et al. (2011) and Velliscig et al. (2014). Note that this also confirms that the findings of van Daalen et al.
(2011), namely that AGN feedback decreases the matter power spectrum at the 1 − 10% level out to extremely large scales ($r \sim 10\,h^{-1}\,\mathrm{Mpc}$), are caused by the effect (in Fourier space) of a systematic change in the profile of haloes, rather than by AGN somehow having a significant effect on the mass distribution out to more than 10 times the virial radius of the haloes they occupy.

6 The rms shift in position for subhaloes between DMONLY and AGN is about $0.04\,r_{\rm vir}$. Similar values are found for shifts between subhaloes in DMONLY and REF.

Figure 3.6: As Figure 3.4, but now only showing the cross-correlation functions between matter and subhaloes that have been linked between a baryonic simulation and DMONLY, and that have been selected based on their mass in the latter. Relative to Figure 3.4, this procedure removes the effects of changes in the subhalo masses, leaving only the effect on the mass profiles and the changes in the positions of the subhaloes. As can be seen from the bottom panel, the change of the mass profile tends to increase the clustering on the very smallest scales (where baryons cool to), but decreases it on intermediate scales (where baryons are evacuated). The latter effect is stronger when AGN feedback is included, and significant over a larger range of scales, for all masses. Shaded areas indicate the regions allowed by 1σ bootstrap errors, which are typically much smaller than the widths of the lines.

There are strong similarities between the relative differences that remain for $\xi_{\rm sm}$ and the relative differences of halo profiles shown in Velliscig et al. (2014) for the same models, leaving no doubts as to the origin of the signal we see here. The strength of the baryonic effect decreases with increasing mass, but is still highly significant at the mass scales of groups and clusters, although it does not extend beyond the virial radius for the highest-mass subhaloes.
The lowest-mass subhaloes we consider here experience a maximum decrease in the cross-correlation with matter of 30%, relative to a dark matter only scenario, and even the most massive subhaloes are 10% less strongly clustered with the matter distribution around $r = 100\,h^{-1}\,\mathrm{kpc}$ when AGN are included. On the smallest scales, the increased clustering due to the cooling of baryons still dominates. Note also that the small-scale differences that we found in Figure 3.4 between REF and AGN remain.

These results show us that assigning subhaloes in a dark matter only simulation the masses they would have had if galaxy formation and efficient feedback had been included allows one to obtain the correct clustering predictions on scales $r \gg 1\,h^{-1}\,\mathrm{Mpc}$. However, on smaller scales one cannot correctly predict the cross-correlation with matter, and hence the galaxy-galaxy lensing signal, to better than ∼ 10% accuracy without taking into account the change in the mass distribution.

3.4 Summary

In this work we investigated how the galaxy and subhalo two-point autocorrelation functions and the cross-correlations with the matter, a measure of the galaxy-galaxy lensing signal, are modified by processes associated with galaxy formation. We utilised a set of cosmological, hydrodynamical simulations with models from the OWLS project, run with more particles and an updated cosmology relative to previous OWLS simulations, to examine what the combined effects on the auto- and cross-correlation functions are of adding baryons and radiative (metal-line) cooling, star formation, chemical enrichment and supernova feedback to a dark matter only simulation, as well as the further addition of a prescription of AGN feedback that reproduces observations of groups and clusters.
As nearly all clustering models employed in the literature rely on pure dark matter distributions, either from N-body simulations or halo model type prescriptions, it is important to quantify just how important the effects of baryons and galaxy formation are. Our findings can be summarised as follows:
• The stellar masses of galaxies are strongly decreased by (AGN) feedback at fixed subhalo mass, which in turn tends to greatly increase the clustering of galaxies at fixed stellar mass. More importantly for semi-analytical and halo models, the masses of subhaloes are also significantly decreased by the effects of feedback, the result of which is an increase in clustering of ∼ 10% on scales $r \gg 1\,h^{-1}\,\mathrm{Mpc}$, for the full range of subhalo masses considered here ($M_{\rm sh,tot} = 10^{11} - 10^{15.5}\,h^{-1}\,\mathrm{M}_\odot$). This effect is much stronger on smaller scales.
• Both the change in subhalo mass and the modified subhalo profiles act to change the subhalo-matter cross-correlation function by ∼ 5% on large scales, and significantly more on sub-Mpc scales. The modulation of the signal is strongly non-monotonic and mass-dependent, with both significant increases and decreases in clustering on different scales.
We used the identical initial conditions of our simulations to link each baryonic subhalo with its dark matter only counterpart, allowing us to effectively exclude the effect of galaxy formation on the change in the masses of these objects. Nearly all subhaloes are successfully matched in this way.
• While accounting for the change in mass of subhaloes removes essentially all of the baryonic effects on the autocorrelation of subhaloes on scales $r \gg r_{\rm vir}$, deviations ∼ 10% remain on scales $r \lesssim 2r_{\rm vir}$, where $r_{\rm vir}$ is the virial radius of the subhalo.
We argued that these deviations are mainly caused by the differences in the dynamics of satellites, such as the initially greater dynamical friction experienced by the more massive, recently accreted pure dark matter satellites, and the increased concentration of baryonic subhaloes.
• Finally, on scales $r \lesssim 1\,h^{-1}\,\mathrm{Mpc}$ strong deviations in the subhalo-matter cross-correlation function remain after accounting for the change in the masses of subhaloes. While on galactic scales ($\lesssim 10\,h^{-1}\,\mathrm{kpc}$) the clustering of subhaloes with matter is always much higher in a baryonic simulation than in the corresponding dark matter only simulation, the inclusion of baryons results in a significant decrease of the cross-correlation for $r \gtrsim 10\,h^{-1}\,\mathrm{kpc}$. These effects are stronger for lower-mass subhaloes, reaching up to 30% for subhaloes with masses $10^{11} < M_{\rm sh,tot}/[\mathrm{M}_\odot/h] < 10^{12}$. When AGN feedback is included, $\xi_{\rm sm}$ decreases by ∼ 10% relative to a dark matter only simulation for $r \sim 10^{2}\,h^{-1}\,\mathrm{kpc}$, even for subhalo masses $M_{\rm sh,tot} > 10^{14}\,h^{-1}\,\mathrm{M}_\odot$. Mass- and radius-dependent rescalings of halo profiles which extend to several times the virial radius would be needed to account for this effect in dark matter only simulations.

We note that while many of our results rely on a model that includes AGN feedback, other feedback processes may have similar effects on clustering. In principle, any other mechanism that is also effective at high masses, sufficiently reducing the stellar masses of massive galaxies, and allows one to reproduce the global properties of groups and clusters, may show similar effects to those shown here for AGN feedback. For example, a model in which a top-heavy IMF is used in high-pressure environments, such as the OWLS model DBLIMF, may have the same qualitative effect on clustering (see e.g. van Daalen et al., 2011).
We stress that while the effects discussed in this chapter will certainly need to be modelled in order to achieve the accuracy needed to interpret upcoming cosmological data sets to their full potential, both our knowledge of the relevant physics involved and the currently achievable resolution in cosmological simulations still allow for significant uncertainty in the clustering measures discussed here. The same holds for quantities such as the halo or cluster mass function: much work is yet to be done before we can converge on a realistic prescription of galaxy formation, with uncertainties small enough to match observations in the era of precision cosmology. Although approaches based on dark matter only models, such as semi-analytical modelling or halo occupation distributions, are able to match the observed galaxy mass function, our results imply that their predictions for galaxy-galaxy and galaxy-mass clustering will have errors greater than 10% on sub-Mpc scales, unless the simulation results are modified to correctly account for the effects of baryons on the distributions of mass and satellites.

Acknowledgements

The authors thank Raul Angulo, Marcello Cacciato and Simon White for useful comments and discussions, and Marcello Cacciato also for comments on the manuscript. We would also like to thank the anonymous referee for suggestions that improved the paper. The simulations presented here were run on the Cosmology Machine at the Institute for Computational Cosmology in Durham (which is part of the DiRAC Facility jointly funded by STFC, the Large Facilities Capital Fund of BIS, and Durham University) as part of the Virgo Consortium research programme. This work was sponsored by the Dutch National Computing Facilities Foundation (NCF) for the use of supercomputer facilities, with financial support from the Netherlands Organization for Scientific Research (NWO).
We also gratefully acknowledge support from the European Research Council under the European Union’s Seventh Framework Programme (FP7/2007-2013) / ERC Grant agreement 278594-GasAroundGalaxies and from the Marie Curie Training Network CosmoComp (PITN-GA-2009-238356).

3.A Convergence tests

Here we investigate the effects of changing the box size or resolution of the simulations used in this chapter on the subhalo autocorrelation function, as this is the main focus of this chapter. We will also briefly discuss the effects on the subhalo-matter cross-correlation function.

In Figure 3.7 we show the subhalo autocorrelation functions for models DMONLY and REF. For clarity the correlation functions for the AGN model are not shown, but the results are very similar. Contrary to what was done for the figures in the main text, here we do not impose a minimum number of pairs per bin. We vary both the box size and particle number in a systematic way: for simulations shown with the same line style (either solid or dashed) we vary the box size at fixed resolution, while for simulations shown with the same colour we vary the resolution at fixed box size.

Figure 3.7: The relative differences in the subhalo autocorrelation functions between models DMONLY and REF, split by subhalo mass as indicated in the top left of each panel. Contrary to the plots shown in other sections, no minimum number of pairs per bin is imposed. The box sizes and particle numbers, as well as the subhalo numbers for DMONLY and REF, respectively, are indicated in the legend, and a vertical dotted line indicates the mean virial radius in each mass bin. At fixed resolution (same line style) very little changes, although the effect of the better statistics offered by a larger volume is apparent. At fixed box size (same colour) the results are also very similar, except for the lowest mass bin, where the small-scale clustering is resolution dependent. Note that all simulations show excellent agreement for $10^{12} < M_{\rm sh,tot}/[\mathrm{M}_\odot/h] < 10^{13}$, where neither resolution nor volume is an issue.

Figure 3.8: As Figure 3.7, but now only showing the autocorrelation functions for subhaloes linked between REF and DMONLY, and selected based on their mass in the latter. The convergence here is very similar to that seen in Figure 3.7.

We first consider the effect of changing the size of the simulated volume. Looking at the solid and dashed lines separately, we can see that very little changes at fixed resolution, except that the results clearly benefit from the better statistics offered by a larger volume. This is noticeable both for the rare high-mass objects, on any scale, and for low-mass objects on the very smallest scales, where very few pairs are found.

If we instead consider each colour of Figure 3.7 separately, we see that at fixed box size the results are also very similar. The exception is the lowest mass bin, $10^{11} < M_{\rm sh,tot}/[\mathrm{M}_\odot/h] < 10^{12}$, where the correlation function is clearly resolution dependent when baryons are included. This is because these subhaloes contain only $\sim 10^{2}$ particles in the low-resolution simulations, which is not quite enough for convergence, especially when feedback processes are included. We have verified that the subhalo mass functions of the highest-resolution simulations shown here are indeed converged using simulations with smaller volumes and higher resolutions (not shown here). The results for the second mass bin on the other hand, $10^{12} < M_{\rm sh,tot}/[\mathrm{M}_\odot/h] < 10^{13}$, are fully consistent between the different resolutions shown here.

We have repeated these same resolution tests for the autocorrelation functions of linked subhaloes, shown in Figure 3.8. Here, too, we see that our results are converged for $M_{\rm sh,tot} > 10^{12}\,h^{-1}\,\mathrm{M}_\odot$.
Based on these tests, we choose to use the higher-resolution L200N1024 simulations for subhaloes with masses $10^{11} < M_{\rm sh,tot}/[{\rm M}_\odot/h] < 10^{13}$, and take advantage of the better statistics offered by the L400N1024 simulations for subhalo masses $M_{\rm sh,tot} > 10^{13}\,h^{-1}\,{\rm M}_\odot$. Similarly, we opt to use the higher-resolution simulation for the autocorrelation function of galaxies with stellar masses $10^{9} < M_*/[{\rm M}_\odot/h] < 10^{11}$, and the larger-volume simulation for galaxies with $M_* > 10^{11}\,h^{-1}\,{\rm M}_\odot$.

We also verified that the cross-correlation functions shown in this work are sufficiently converged (not shown). For the subhalo-matter (and galaxy-matter) cross-correlation functions, statistics are less of an issue, as the number of particles is the same for the L200 and L400 simulations. In other words, while for the autocorrelation functions the number of pairs scales as $N_{\rm obj}^2$, the number of pairs for the cross-correlation functions scales as $N_{\rm obj} N_{\rm part}$, where $N_{\rm part} \gg N_{\rm obj}$. Resolution is still an issue, however: while simulations including baryons always show stronger clustering on galaxy scales than DMONLY, the exact scale on which the transition from a relative increase to a relative decrease in clustering occurs depends somewhat on the softening length. Additionally, as we discussed briefly in §3.3.1, the effect of AGN feedback is resolution-dependent in our simulations, due to the fact that seed black holes can only be inserted in resolved haloes. AGN feedback may therefore be weaker at the L400 resolution than at the L200 resolution, while the strength of the feedback in the latter was deemed realistic. We therefore choose to use the L200 simulations at all masses when considering the cross-correlation functions $\xi_{\rm gm}$ and $\xi_{\rm sm}$, valuing resolution over volume.

3.B Linked fractions

Here we consider the fraction of subhaloes for which a link can be established between DMONLY and REF as a function of both mass and, in the case of satellites, radius.
Both numerical and physical effects play a role here. First, at small radii subfind may fail to detect satellite subhaloes even though these have not been fully disrupted yet, due to the high background density of the main halo (e.g. Muldrew, Pearce & Power, 2011). As baryonic subhaloes are typically more concentrated than dark matter only ones, increasing their density contrast, these can be detected down to smaller radii. Second, baryonic satellites tend to survive longer than their dark matter only counterparts, as their increased concentration also allows them to better withstand the tidal forces of the main halo (e.g. Macciò et al., 2006). Because of this, our results for linked samples may be biased at radii where a significant fraction of satellite subhaloes is unlinked, as we expect to be better able to detect a pair of identical subhaloes when the baryonic one is located at smaller radii than the dark matter only one, relative to a situation in which the dark matter only satellite is located at smaller radii than its baryonic counterpart.

In Figure 3.9 we show the fraction of subhaloes in DMONLY for which a counterpart is found in REF. Once again we do not show a comparison with AGN for clarity, but note that very similar results are obtained. Horizontal lines show the total fraction of DMONLY subhaloes (both centrals and satellites) that is recovered in REF, while lines with plot symbols show the fraction of satellites for which a link is found as a function of radius. It is clear that the linked fraction depends heavily on both box size and resolution for $M_{\rm sh,tot} < 10^{12}\,h^{-1}\,{\rm M}_\odot$, although the effect of the box size is only significant for the low-resolution simulations. For the simulation employed in this mass bin throughout the main text of the chapter, L200N1024, the total fraction of linked subhaloes is around 98%. However, the fraction of linked satellites is significantly lower, especially for radii $r \lesssim 2\,r_{\rm vir}$, where the different survival and detection rates of baryonic subhaloes are expected to play a role. Comparing this panel to the corresponding panel in Figure 3.8, we see that the drop in the fraction of matched satellites at small radii corresponds to the strong increase in clustering found for baryonic subhaloes, indicating that this may be a biased result.

Similar results are found for satellites with masses $10^{12} < M_{\rm sh,tot}/[{\rm M}_\odot/h] < 10^{13}$, although both the total and the satellite linked fractions are much higher than for $10^{11} < M_{\rm sh,tot}/[{\rm M}_\odot/h] < 10^{12}$, for all simulations and radii. No drop-off in the linked fraction of satellites is observed at higher masses.

Figure 3.9: The fraction of subhaloes in DMONLY for which a link was found in REF, split by subhalo mass as indicated in the top left of each panel. Colours and line styles are as in Figure 3.7, and a vertical dotted line once again indicates the mean virial radius in each mass bin. Horizontal lines show the total fraction of linked subhaloes (both centrals and satellites) at the corresponding box size and resolution, while the lines with plot symbols show the fraction of satellite subhaloes linked as a function of radius. For $r \lesssim 2\,r_{\rm vir}$ the fraction of linked satellites typically drops sharply as subhaloes are destroyed by tidal stripping or become undetectable. Both the matched satellite and total fractions depend strongly on box size and resolution for subhalo masses $M_{\rm sh,tot} < 10^{12}\,h^{-1}\,{\rm M}_\odot$.

Based on these results, we have chosen to grey out the relative difference curves in the figures showing autocorrelation functions (Figures 3.3 and 3.5) at radii where the fraction of linked satellites is < 95% of the total matched fraction. Note that this may not completely remove the possible bias on scales where the satellite contribution dominates the correlation function.
Further investigation with higher-resolution simulations is needed to determine whether the upturn observed at small radii is physical or numerical in origin. Note that the occasional downturn of the linked fraction at relatively large radii, $r \gtrsim 2\,r_{\rm vir}$, is due to small-number statistics, as low-mass subhaloes found at these radii are rarely satellites. As the autocorrelation function of linked subhaloes at these radii is dominated by central-central pairs, we do not apply a cut at $r \geq 2\,r_{\rm vir}$.

4 The effects of halo alignment and shape on the clustering of galaxies

We investigate the effects of halo shape and its alignment with larger scale structure on the galaxy correlation function. We base our analysis on the galaxy formation models of Guo et al., run on the Millennium Simulations. We quantify the importance of these effects by randomizing the angular positions of satellite galaxies within haloes, either coherently or individually, while keeping the distance to their respective central galaxies fixed. We find that the effect of disrupting the alignment with larger scale structure is a $\sim 2$ per cent decrease in the galaxy correlation function around $r \approx 1.8\,h^{-1}\,{\rm Mpc}$. We find that sphericalizing the ellipsoidal distributions of galaxies within haloes decreases the correlation function by up to 20 per cent for $r \lesssim 1\,h^{-1}\,{\rm Mpc}$ and increases it slightly at somewhat larger radii. Similar results apply to power spectra and redshift-space correlation functions. Models based on the Halo Occupation Distribution, which place galaxies spherically within haloes according to a mean radial profile, will therefore significantly underestimate the clustering on sub-Mpc scales. In addition, we find that halo assembly bias, in particular the dependence of clustering on halo shape, propagates to the clustering of galaxies. We predict that this aspect of assembly bias should be observable through the use of extensive group catalogues.

Marcel P. van Daalen, Raul E. Angulo and Simon D. M. White
Monthly Notices of the Royal Astronomical Society
Volume 424, Issue 4, pp. 2954-2960 (2012)

4.1 Introduction

Investigating how matter is organized in our Universe is one of the key ways in which we can test the validity of cosmological models and constrain their parameters. By comparing theoretical predictions to observed measures of structure, such as the galaxy correlation function or the matter power spectrum, one can reject some models and fine-tune others. It is, however, important to keep in mind the limitations of theoretical models, both numerical and analytical, when making this comparison, as these may limit the applicability of the results.

There are various ways in which one can predict the organization, or “clustering”, of matter and galaxies theoretically. One can use fully hydrodynamical simulations, in which dark matter, gas and stars are treated explicitly, to follow the formation and evolution both of dark matter haloes and of the galaxies within them. For a recent review of the numerical methods behind such simulations, see Springel (2010). Such models are computationally expensive, limited to small volumes in comparison to recent galaxy surveys, and sensitive to the ad hoc subgrid recipes required to include critical processes like star formation and feedback.

An alternative, first implemented by Kauffmann et al. (1999) (see also Springel et al., 2001; Springel, 2005a), is to combine N-body simulations of the growth of dark matter structures with semi-analytic models of galaxy formation (e.g. White & Frenk, 1991; Kauffmann, White & Guiderdoni, 1993; Cole et al., 1994; see Baugh, 2006 for a review). A great advantage of semi-analytic simulations is that they require comparatively little CPU time even for a large underlying N-body simulation.
This allows them to be run many times and on many haloes, so that one can explore the physical processes and the associated parameters that are required to produce galaxy populations in agreement with selected observational data (such as the galaxy stellar mass, luminosity or correlation functions). Such semi-analytic simulations do not focus on the properties of individual objects, but rather on the underlying statistical properties of the entire population. In this way, the relative importance of different physical processes can be examined as a function of the time and place where they are occurring. A disadvantage of such simulations is that they provide only very crude information on the structure of individual objects.

Yet another alternative is to take the statistical approach one step further. If one is interested only in the present-day clustering of galaxies, the physical processes associated with their formation and evolution may not be relevant. One can then populate the haloes in an N-body simulation with galaxies using a purely statistical model that depends on current halo properties, for example halo mass. Galaxy clustering can then be described in terms of the clustering of the haloes. This approach is known as halo occupation distribution modelling, or simply HOD modelling (see Cooray & Sheth, 2002, for a review). Typically, central and satellite galaxies are treated separately, as each halo will contain one and only one of the former but may contain none or many of the latter (Kauffmann et al., 1999; Kravtsov et al., 2004; Zheng et al., 2005). The satellite galaxies assigned to a halo are usually assumed to be spherically distributed following a standard profile such as that of Navarro, Frenk, & White (1997). Attempts at including substructure or an environmental dependence have also been made (e.g. Giocoli et al., 2010; Gil-Marín, Jimenez & Verde, 2011). Note that by assuming spherical symmetry some information is lost.
As the simulations of Davis et al. (1985) first showed, cold dark matter haloes are typically strongly ellipsoidal. If the distribution of galaxies follows the mass distribution, this would leave an imprint on the galaxy correlation function on small scales. Furthermore, halo ellipticity may also have an effect on larger scales. If neighbouring haloes are aligned, as expected from tidal-torque theory, this will boost the correlation on scales corresponding to the typical separations between haloes. The ellipticity and intrinsic alignment of dark matter haloes and their galaxy populations have been the subject of many earlier studies, e.g. Carter & Metcalfe (1980); Binggeli (1982); West (1989); Splinter et al. (1997); Jing & Suto (2002) and Bailin & Steinmetz (2005), and recently Paz et al. (2011) and Smargon et al. (2012). Most relevant to the current work are the studies by Smith & Watts (2005) and Zu et al. (2008). The former authors investigated the effects of halo triaxiality and alignment on the matter power spectrum in the halo model framework. Inspired by the results of simulations, they took a purely analytic approach in which they re-developed the halo model to account for ellipsoidal halo shapes. Zu et al. (2008), on the other hand, used the semi-analytical models of De Lucia & Blaizot (2007) to investigate environmental effects, including that of halo ellipticity, on the galaxy correlation function in both real and redshift space. There exists a deeper connection between halo shape and clustering that we also explore in this chapter. As Bett et al. (2007) and Faltenbacher & White (2010) have previously shown, at fixed mass the clustering of haloes depends on their shape. This is probably a reflection of assembly bias (i.e. the dependence of halo clustering on properties other than mass, Gao & White, 2007). More specifically, the most spherical haloes in their samples cluster significantly more strongly than average, and the most aspherical more weakly. 
The known correlations between formation time and halo shape (e.g. Allgood et al., 2006; Ragone-Figueroa et al., 2010), and between formation time and clustering strength (e.g. Gao, Springel & White, 2005; Wechsler et al., 2006; Wetzel et al., 2007; Jing, Suto & Mo, 2007) point in this direction but may by themselves not be strong enough to explain the magnitude of the effect. Here, we investigate whether this shape-dependence is also seen in the clustering of the galaxies. If so, this would bring us one step closer to measuring assembly bias directly in observations. We are herein also motivated by the results from Zhu et al. (2006) and Croton, Gao & White (2007), who showed that assembly bias in general is indeed expected to propagate to galaxy clustering.

In this chapter, we expand upon previous work by investigating the effects of alignment and ellipticity on the galaxy correlation function using the Millennium Simulation (Springel, 2005a) and the semi-analytic models of Guo et al. (2011). In Section 4.2, we discuss these simulations and our methods for quantifying the effects of alignment and ellipticity. We also outline our procedure for determining the shape-dependence of galaxy bias. We show our results in Section 4.3, and present our conclusions in Section 4.4.

4.2 Methods

4.2.1 Simulation and SAM

We make use of the galaxy catalogues generated by Guo et al. (2011, hereafter G11), who implemented galaxy formation models on the Millennium Simulations (Springel, 2005a; Boylan-Kolchin et al., 2009). The Millennium Simulation (MS) is a very large cosmological N-body simulation in which $2160^3$ particles were traced from redshift 127 to the present day in a periodic box of comoving side $500\,h^{-1}\,{\rm Mpc}$. The Millennium-II Simulation (MS-II) follows the same number of particles in a box of side $100\,h^{-1}\,{\rm Mpc}$ and so has 125 times better mass resolution.
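As a short worked check of the quoted factor (a sketch that assumes only that both runs share the same mean matter density $\bar\rho_{\rm m}$ and the same particle number $N = 2160^3$, with $m_{\rm p}$ an illustrative symbol for the particle mass), the ratio of particle masses is simply the ratio of the box volumes:

\[
\frac{m_{\rm p}^{\rm MS}}{m_{\rm p}^{\rm MS\text{-}II}}
= \frac{\bar\rho_{\rm m}\, L_{\rm MS}^3 / N}{\bar\rho_{\rm m}\, L_{\rm MS\text{-}II}^3 / N}
= \left(\frac{500\,h^{-1}\,{\rm Mpc}}{100\,h^{-1}\,{\rm Mpc}}\right)^3
= 5^3 = 125 .
\]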
Both simulations assume a ΛCDM cosmology with parameters based on a combined analysis of the 2dFGRS (Colless et al., 2001) and the first-year WMAP data (Spergel et al., 2003). These cosmological parameters, given by $\{\Omega_{\rm m}, \Omega_{\rm b}, \Omega_\Lambda, \sigma_8, n_{\rm s}, h\} = \{0.25, 0.045, 0.75, 0.9, 1.0, 0.73\}$, are not consistent with the latest analyses of the CMB data, for example the seven-year WMAP results (Komatsu et al., 2011). In particular, the more recent data prefer lower $\sigma_8$ and higher $\Omega_{\rm m}$ values.$^1$ We will only make relative comparisons between clustering statistics here, and do not expect our results to be significantly influenced by these small parameter differences (see Guo et al., 2013).

The galaxy formation models of G11 allow galaxies to grow at the potential minima of the evolving population of haloes and subhaloes in the simulations. Each Friends-of-Friends (FoF) group contains a central galaxy at the potential minimum of its main subhalo, and may contain many satellite galaxies at the centres of surrounding subhaloes. In some cases, due to tidal effects, a satellite galaxy may be stripped of its dark matter to the point where its subhalo is no longer identified as a bound substructure, turning the galaxy into an “orphan”. Such galaxies follow the orbit of the dark matter particle that had the highest binding energy immediately before subhalo disruption, except that their distance to the central galaxy is artificially decreased until they merge with it in order to mimic the effects of dynamical friction. We note that the treatment of the orbits of orphans is approximate, and for example does not include the expected circularization of the orbits (e.g. Boylan-Kolchin, Ma & Quataert, 2008). The models also include treatments of star formation, gas cooling, gas stripping, metal enrichment, supernova and AGN feedback, and galaxy mergers. For more details about the SAM, as well as the treatment of different types of galaxies, we refer to G11.
For our purposes, it is enough to note that the predicted clustering of galaxies is quite a close match to that seen in the Sloan Digital Sky Survey (Guo et al., 2011).

$^1$ See Angulo & White (2010a) for a method to correct for this.

4.2.2 Calculation of the galaxy correlation function

The galaxy two-point correlation function, $\xi(r)$, measures the clustering of galaxies as a function of scale. It effectively encodes the excess probability of finding a pair of galaxies at a given separation $r$, relative to the expectation for a uniform random distribution. In what follows, we will be interested in scales $30\,h^{-1}\,{\rm kpc} < r < 50\,h^{-1}\,{\rm Mpc}$, as these are both well-resolved and well-sampled by the simulation. In order to get accurate results over this full range, we calculate the correlation function by direct pair counts on small scales (i.e. $r \lesssim 4\,h^{-1}\,{\rm Mpc}$) and use an approximate but accurate method to calculate it on intermediate and large scales. A direct calculation of this function scales as the number of galaxies squared and is thus unfeasible for the large sample analysed here. We therefore speed up the calculation by mapping galaxies onto a grid, and we correlate the mean density contrast in each grid cell with that of every other (a method previously employed by, for example, Barriga & Gaztañaga, 2002, Eriksen et al., 2004 and Sánchez, Baugh & Angulo, 2008). We improved the performance on intermediate scales by folding the density field onto itself before its autocorrelation is calculated (see e.g. Jenkins et al., 1998). We do not go into these methods here, but note that tests against higher-accuracy calculations show that the error in the ratio of the correlation functions, which is the relevant quantity for our main results, is less than 1 per cent on all scales considered.
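To make the direct pair-count step concrete, the following is a minimal sketch rather than the code used for this chapter. It assumes a periodic cubic box and the natural estimator $\xi = DD/RR - 1$ with analytic random counts; the text does not specify the estimator, and the function name `xi_direct` and its arguments are hypothetical. It stores all $N^2$ separations at once, so it is only suitable for small samples, and it is valid only for separations below half the box size.

```python
import numpy as np

def xi_direct(pos, box, r_edges):
    """Direct pair-count estimate xi = DD/RR - 1 in a periodic cube.

    pos     : (N, 3) array of positions in [0, box)
    box     : side length of the periodic box
    r_edges : radial bin edges, all smaller than box / 2
    """
    pos = np.asarray(pos, dtype=float)
    r_edges = np.asarray(r_edges, dtype=float)
    n = len(pos)
    # Separation vectors of all pairs, wrapped to the nearest periodic image.
    d = pos[:, None, :] - pos[None, :, :]
    d -= box * np.round(d / box)
    r = np.sqrt((d ** 2).sum(axis=-1))
    # Keep every unordered pair exactly once (no self-pairs).
    r = r[np.triu_indices(n, k=1)]
    dd, _ = np.histogram(r, bins=r_edges)
    # Expected counts for a uniform distribution: N(N-1)/2 pairs times the
    # fraction of the box volume taken up by each spherical shell.
    shell_volume = 4.0 / 3.0 * np.pi * np.diff(r_edges ** 3)
    rr = 0.5 * n * (n - 1) * shell_volume / box ** 3
    return dd / rr - 1.0
```

The quadratic cost of this approach in the number of objects is the reason a grid-based method is preferred on larger scales.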
For the correlation functions that we calculate to determine the galaxy bias, a direct pair count over the full range of scales is feasible, as there we only consider relatively small subsets of galaxies (see §4.3.2).

4.2.3 Testing the importance of alignment and ellipticity

During their lifetime, haloes merge and may accrete more subhaloes. The accretion of mass is not isotropic since matter flows in preferentially along filaments (see e.g. Tormen, Bouchet & White, 1997; Colberg et al., 1999 or more recently Vera-Ciro et al., 2011). As a result, the distribution of subhaloes and thus galaxies within a FoF group is generally not isotropic either, but is instead approximately ellipsoidal, following the mass and aligning with surrounding large-scale structure (see e.g. Angulo et al., 2009).

To test whether alignment with neighbouring structure has an effect on clustering statistics, we randomly rotate the haloes around their centres and see if this systematically alters the galaxy correlation function. More precisely, we rotate the satellite population of each FoF group bodily around the central galaxy to a new randomly chosen orientation, and we repeat this process for every FoF group in the simulation. We stress that this transformation preserves the numbers, properties, and relative positions of the galaxies in every halo; only the orientations of the distributions change. We then calculate the galaxy correlation function for the new distribution, and compare it to the original. If alignment with large-scale structure is important, one would expect to see the correlation decrease systematically on scales slightly larger than individual haloes. To estimate the uncertainty in our results, we have repeated this process 25 times, each time with a different set of randomly chosen angles.

The effect of halo ellipticity is tested in a similar way.
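The rotation test can be sketched in a few lines; this is an illustration under stated assumptions, not the code used here. The helper names `random_rotation` and `rotate_satellites` are hypothetical, periodic wrapping of positions is ignored, and rotations are drawn uniformly from SO(3) via a QR decomposition. With `coherent=True` one rotation is applied to all satellites of a group (the bodily rotation); with `coherent=False` each satellite receives its own rotation, which corresponds to the individual randomization used for the ellipticity test.

```python
import numpy as np

def random_rotation(rng):
    """Draw a 3x3 rotation matrix uniformly from SO(3)."""
    # QR of a Gaussian matrix gives a random orthogonal matrix; rescaling the
    # columns by the signs of R's diagonal makes the draw uniform.
    q, r = np.linalg.qr(rng.normal(size=(3, 3)))
    q = q * np.sign(np.diag(r))
    if np.linalg.det(q) < 0:
        q[:, 0] = -q[:, 0]  # turn a reflection into a proper rotation
    return q

def rotate_satellites(central, satellites, rng, coherent=True):
    """Rotate satellites about their central, keeping radial distances fixed."""
    central = np.asarray(central, dtype=float)
    offsets = np.asarray(satellites, dtype=float).reshape(-1, 3) - central
    if coherent:
        # One rotation for the whole group: relative positions are preserved.
        new_offsets = offsets @ random_rotation(rng).T
    else:
        # An independent rotation per satellite: the distribution around the
        # central becomes isotropic on average.
        new_offsets = np.empty_like(offsets)
        for i, vec in enumerate(offsets):
            new_offsets[i] = random_rotation(rng) @ vec
    return central + new_offsets
```

In both modes the distance of every satellite to its central is unchanged, so the mean radial profile of each group is preserved by construction.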
Here, we randomly rotate the position of each individual satellite galaxy around its central, rather than rotating all satellites together. In this way, the galaxy distribution within each halo is sphericalized. Since the distribution of galaxies within haloes is typically ellipsoidal, this process should increase the average distance between galaxies, thus decreasing the correlations between galaxies in the same halo. We note that Zu et al. (2008) investigated the effect of halo ellipticity in the same way.

4.2.4 Testing the dependence of galaxy bias on halo shape

Faltenbacher & White (2010) showed that the clustering of haloes depends on the shape of the halo, defined as $s = c/a$, where $c$ and $a$ are eigenvalues of the inertia tensor ($a > b > c$). Specifically, they showed that the large-scale bias of haloes with more spherical shapes is larger than average, while the inverse is true for the most aspherical haloes. They also found that this difference decreases with equivalent peak height, $\nu(M, z) = \delta_{\rm c}(z)/\sigma(M, z)$, where $\sigma(M, z)$ is the root-mean-square linear overdensity within a sphere which contains the mass $M$ in the mean, and $\delta_{\rm c}(z)$ is the linear overdensity threshold for collapse at redshift $z$. Here, we are interested in seeing if this shape-dependent clustering, which might reflect the assembly bias of the haloes, is also recovered from the galaxy distribution.

We use the halo shape data from Faltenbacher & White (2010), who calculated the inertia tensor from the dark matter particles belonging to the most-pronounced subhalo$^2$ of each FoF halo, which on average comprises $\sim 80$ per cent of its mass. To ensure that the shapes were accurately determined, only haloes with at least 700 particles were considered, corresponding to a minimum (sub)halo mass $M = 6.02 \times 10^{11}\,h^{-1}\,{\rm M}_\odot$.
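As an illustration of the shape measure $s = c/a$, the sketch below computes the ratio of the smallest to the largest eigenvalue of an unweighted second-moment tensor $I_{ij} = \sum_k x_{k,i}\,x_{k,j}/N$ about a chosen centre. This form of the tensor is an assumption made for the example; the exact definition and weighting used by Faltenbacher & White (2010) may differ, and the function name `shape_parameter` is hypothetical.

```python
import numpy as np

def shape_parameter(positions, centre):
    """Return s = c/a, the smallest over the largest tensor eigenvalue."""
    x = np.asarray(positions, dtype=float) - np.asarray(centre, dtype=float)
    # Unweighted second-moment ("inertia") tensor about the chosen centre.
    tensor = x.T @ x / len(x)
    # eigvalsh returns the eigenvalues of a symmetric matrix in ascending order.
    eigenvalues = np.linalg.eigvalsh(tensor)
    return eigenvalues[0] / eigenvalues[-1]
```

A perfectly isotropic set of points gives $s = 1$, while stretching the distribution along one axis lowers $s$ towards zero; the same function can be applied to dark matter particles or to galaxy positions.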
We compare this to the shape measured from the galaxy distribution in the same way, using all galaxies with stellar masses $M_* > 10^{9}\,h^{-1}\,{\rm M}_\odot$ within a sphere of radius $R_{200}$, defined as the radius within which the mean enclosed density is 200 times the mean density of the Universe, centred on the central galaxy. Using such a distance cut makes it easier to compare our results to observations; in fact, a similar procedure is often followed when determining the richness of real groups and clusters. Rejecting galaxies outside the virial radius slightly biases us to measure more spherical shapes, but we have checked that this effect is small and does not significantly affect our results.

After splitting the galaxies by the shape of their halo (measured either from the dark matter or from the galaxies themselves) we determine the large-scale galaxy bias factor $b_{\rm gal}$ for each subsample. Here, too, we follow Faltenbacher & White (2010), who in turn followed the approach of Gao & White (2007). The bias is computed as the relative normalization factor that minimizes the mean square of the difference $\log(\xi_{\rm gm}) - \log(b_{\rm gal}\,\xi_{\rm mm})$ for four bins spaced equally in $\log r$ in the range $6 < r < 20\,h^{-1}\,{\rm Mpc}$. Here $\xi_{\rm mm}$ is the dark matter autocorrelation function and $\xi_{\rm gm}$ is the cross-correlation function of galaxies and dark matter. Note that unlike Faltenbacher & White (2010) we are only interested in the results at $z = 0$.

$^2$ The most-pronounced subhalo may differ from the most massive subhalo only if the FoF group hosts two or more subhaloes with roughly the same mass. In this case one of these is arbitrarily assigned to be the most massive. The most-pronounced subhalo is more consistently defined by using the halo merger tree. It is also the subhalo hosting the most luminous galaxy in the FoF group.

4.3 Results

4.3.1 Alignment and ellipticity

We will first discuss our results for “bodily” rotations, which test the effect of halo alignment. Figure 4.1 shows the fractional difference between the correlation functions of the “rotated” and original samples, plotted against the real-space separation $r$. We have only used those galaxies from the catalogue generated by G11 that have a stellar mass $M_* > 10^{9}\,h^{-1}\,{\rm M}_\odot$, as the Millennium Simulation is not complete below this limit. This provides a sample of 5 200 801 galaxies. We note that increasing this mass limit by a factor of ten does not influence our results, either qualitatively or quantitatively. All random rotations are applied prior to the mass cut in order to avoid problems in cases where a central galaxy below the limiting mass has satellites above it. Coloured lines indicate different sets of rotations, while the thick, black line shows the average of these. There is clearly a significant dip around $r \approx 2\,h^{-1}\,{\rm Mpc}$, with a depth of 2 per cent. This is due to the disruption of the alignment between haloes and surrounding structure. Note that the scatter is extremely low, due to the large number of objects (in fact, the uncertainty at large scales is dominated by the errors due to our approximate calculation of the correlation). Neglecting the orientation of haloes when populating them with galaxies will therefore have a modest, but significant, effect on the derived correlation function.

Figure 4.1: The effect of halo alignment on the galaxy correlation function. The x-axis shows the real-space separation $r$, while the y-axis shows the fractional difference between the correlation function after random bodily rotations are applied, and that of the original, unrotated sample. All simulated galaxies with $M_* > 10^{9}\,h^{-1}\,{\rm M}_\odot$ from the $z = 0$ catalogue of G11 have been used here. The bin size is roughly 0.07 dex. Each of the 25 thin, coloured lines represents a different set of random rotations, and the thick, black line shows the average of these. There is a clear signal around $r \approx 1.8\,h^{-1}\,{\rm Mpc}$, where the correlation function is lowered by roughly 2 per cent.

The ellipsoidal shape of the haloes, and the fact that the galaxy distribution follows this shape, is a more significant factor when modelling the galaxy distribution. Figure 4.2 shows the result of applying independent rotations, which sphericalize the galaxy distributions within haloes. This substantially suppresses correlations for $r \lesssim 2\,h^{-1}\,{\rm Mpc}$, with a $\sim 20$ per cent effect on the smallest scales probed here. The ellipsoidal shape of the galaxy distribution within haloes significantly reduces the typical separations of pairs within them. This is compensated by a 1-2 per cent stronger correlation around $r \approx 3.5\,h^{-1}\,{\rm Mpc}$. We conclude that models that assume spherical profiles for the distribution of galaxies within haloes will underestimate the galaxy correlation function by up to $\sim 20$ per cent, depending on the smallest scale considered. Note that the scatter is even smaller than before on all scales, which is due to the increased number of degrees of freedom here. This result is in excellent agreement with that of Zu et al. (2008), who applied similar methods to all galaxies from De Lucia & Blaizot (2007) with r-band luminosities $M_r < -19$.

Figure 4.2: The effect of halo ellipticity on the galaxy correlation function. The bin size, axes and lines are as in Figure 4.1, although the bodily rotations have been replaced by random independent rotations of satellites around their central galaxies. A peak of 1-2 per cent can be seen around $r \approx 3.5\,h^{-1}\,{\rm Mpc}$, but the largest effect is seen on small scales, where the correlation function is systematically lowered by up to $\sim 20$ per cent. Note also that the scatter has been greatly reduced relative to Figure 4.1. This is mainly due to the larger number of random rotations used when rotating satellites separately.

Figure 4.3: Same as Figures 4.1 and 4.2, but now for the fractional differences in the galaxy power spectrum versus the wave number $k$. Left: Result when testing for alignment. Again a weak but systematic signal of a few per cent can be seen, now between $k \approx 0.1\,h\,{\rm Mpc}^{-1}$ and $k \approx 2\,h\,{\rm Mpc}^{-1}$. Right: Result when testing for ellipticity. A monotonic decline in power sets in at $k \approx 0.1\,h\,{\rm Mpc}^{-1}$, reaching roughly 20 per cent at $k = 30\,h\,{\rm Mpc}^{-1}$, matching the result found in Figure 4.2. This again demonstrates the importance of taking the ellipsoidal shape of the galaxy distribution within haloes into account.

One might worry that the necessarily artificial treatment of orphan, or “type 2”, galaxies in the galaxy formation models of G11 influences these results. G11 already showed that the inclusion of the orphans is critical if the radial distribution of galaxies within rich clusters in the Millennium Simulation is to agree both with observations and with the much higher-resolution MS-II. We have investigated effects on our analysis by repeating it with these galaxies removed, reducing our sample size by $\sim 24$ per cent and spoiling the relatively good agreement of its small-scale correlation with observation. This removal significantly amplifies the signal found for the effects of alignment. This is because the orphan galaxies are primarily located near halo centres. Once they are removed, galaxies that do contribute to the alignment signal receive more weight. Orphan removal also changes the signal found for the effects of ellipticity, modestly boosting it down to $r \approx 0.1\,h^{-1}\,{\rm Mpc}$.

As a further check that the distribution of orphans in the simulation is realistic, we have examined the shapes of the galaxy distributions of massive haloes (specifically, $14 < \log_{10}(M_{\rm FoF}/[h^{-1}\,{\rm M}_\odot]) < 14.5$) in the MS and MS-II, again using G11’s galaxy catalogues and considering only galaxies with stellar masses $M_* > 10^{9}\,h^{-1}\,{\rm M}_\odot$.
The better mass resolution of the MS-II results in far fewer orphans in this mass range, and consequently the positions of galaxies in MS-II are determined more accurately. Nevertheless, the shapes of the galaxy distributions agree very well, thus implying that the distribution of orphans in the MS is consistent with the distribution of similar, but unstripped, galaxies in the MS-II. We also found that these shapes agree very well with those of the dark matter haloes themselves (see Bett et al., 2007). A more detailed discussion of the shapes is beyond the scope of this chapter.

For completeness, we have also compared the galaxy-galaxy power spectra of the rotated and unrotated samples. The results are shown in Figure 4.3. Three foldings in total were used to calculate the power spectra over the full range shown, each with a fold factor of six (i.e. each folding maps the particle distribution to 1/216th of the volume). The power spectra were re-binned logarithmically to resemble the bins used for the galaxy correlation functions, and to reduce noise. Shot noise, which dominates the power for k ≳ 10 h Mpc^-1, was subtracted. The left-hand figure shows the fractional differences that result from applying bodily rotations. Just as for the correlation function, there is a weak but clear dip of 1 − 2 per cent present which reflects the alignment of haloes with surrounding structure. As expected, the right-hand figure shows a much stronger decrease in power, up to 20 per cent on the smallest scales considered. This again demonstrates the importance of taking the ellipsoidal distribution of galaxies within FoF groups into account in, for example, models that use the full shape of the power spectrum to extract cosmological parameters.

Our results differ from those found by Smith & Watts (2005).
Like ourselves, they find that the scale at which the contribution from alignments to the power spectrum is maximal is ∼ 0.5 h Mpc^-1, but they show that the relative contribution of alignments is strongly model-dependent, varying from 10^-12 to 10 per cent. Furthermore, they find that when haloes are assumed to be spherical, the power is higher than when they are ellipsoidal by up to 5 per cent. Not only is the effect we find significantly stronger and increasing towards smaller scales up to at least k = 30 h Mpc^-1, but its sign is opposite. We attribute these differences to the fact that in both models explored by Smith & Watts (2005), the radially averaged density profiles are not conserved when transforming the haloes from spherical to triaxial, making a comparison with our own results difficult.

Figure 4.4: A comparison of the halo shape measured from the dark matter and that measured from the galaxy distribution. Here shape is defined as the ratio of the smallest to the largest eigenvalue of the inertia tensor. The shape on the horizontal axis is computed using all galaxies satisfying M∗ > 10^9 h^-1 M⊙ within R200. For the shape on the vertical axis, only CDM particles belonging to the most-pronounced substructure are included, and only if the number of particles is at least 700. Pixels are colour-coded by mean number of galaxies per halo. The “true” shape is recovered more accurately when the number of galaxies increases; some scatter remains however, as the galaxy distribution does not perfectly trace the main subhalo.

4.3.2 Shape-dependent galaxy bias

Having established that the shape of the galaxy distribution significantly affects the small-scale clustering, we now investigate how the shape-dependent assembly bias affects clustering.
We first examine how well the shape measured from the galaxy distribution corresponds to that measured from the dark matter particles belonging to the most-pronounced substructure. Note that while the galaxies trace the shape of the FoF halo very well, this is not necessarily the case for its most-pronounced substructure. Additionally, even if the galaxy distribution traces the dark matter perfectly, the right shape may still not be recovered if only a small number of galaxies is available to sample the halo.

The results of this shape comparison are presented in Figure 4.4. Here the shape measured from the galaxy distribution is shown on the horizontal axis, while the shape measured from the dark matter is on the vertical axis. Each pixel is colour-coded by the mean number of satellites, Nsat, satisfying M∗ > 10^9 h^-1 M⊙ and |rsat − rcen| < R200. A low value of s = c/a indicates that the halo is (measured to be) very aspherical, while a perfectly spherical halo would have s = 1. It is immediately clear that a low number of satellites leads to a severe underestimate of s. This is expected: together with the central galaxy, any two satellites will define a plane, ensuring that c = 0. It is only when the halo is sampled by a large enough ensemble of points that the inertia tensor can be determined accurately. Figure 4.4 illustrates that one needs Nsat ≳ 30 to get an unbiased and accurate shape estimate. However, there is always a significant amount of scatter around the diagonal. This is due to the galaxies tracing the mass, i.e. the shape of the whole FoF group, and not just that of the most-pronounced substructure.

Next, we split our galaxy sample by the shape measured from the halo dark matter distribution and calculate the galaxy bias factor as function of equivalent peak height, bgal(ν), following the method described in §4.2.4. As we can only use galaxies for which a dark matter shape has been determined – i.e.
those in FoF haloes of which the most massive substructure consists of at least 700 particles – our galaxy sample is reduced to 2 953 050 galaxies. The results are shown in Figure 4.5. Here the upper panel shows the galaxy bias determined for each sample at a given equivalent peak height. The points show the median value, horizontal error bars indicate the width of the bin, and vertical error bars show 1σ deviations calculated from 50 bootstrap resamplings of the galaxy catalogues. The black line shows bgal(ν) for the full sample of galaxies, while the coloured lines show the bias for the different subsamples. In the bottom panel, the fractional difference between these subsamples and the full sample is shown. It is immediately clear that there is a strong dependence on shape: at the lowest equivalent peak heights probed here, the bias of the galaxies in the most spherical (aspherical) 20 per cent of haloes is up to 40 per cent higher (lower) than that of the full sample. This shows that the shape-dependence of the halo bias found by Bett et al. (2007) and Faltenbacher & White (2010) is also strongly present in the clustering of the galaxies. The effect grows weaker with increasing ν. However, for ν ≳ 2.5 statistical uncertainties begin to dominate, due to the low number of haloes available at high equivalent peak heights. Note that this is not seen in the data of Faltenbacher & White (2010) as they combine the data from different redshifts, while we only consider z = 0.

Figure 4.5: The dependence of galaxy bias, bgal, on shape, as a function of peak height at z = 0. The black line in the top panel shows the galaxy bias of the full galaxy sample, while coloured lines show the bias of subsamples split by halo shape measured from the dark matter distribution. In the bottom panel the fractional differences of the bias of these subsamples relative to the full sample are shown. The vertical error bars show 1σ deviations calculated from 50 bootstrap resamplings of the galaxy catalogues. We find that galaxies in the most spherical (aspherical) haloes are strongly biased (antibiased) relative to the full sample. The difference can be as much as 40 per cent for ν ≈ 0.7. For ν ≳ 2.5 statistical uncertainties, due to the low number of high-mass haloes, begin to play an important role.

We then repeat this exercise, but this time we split our galaxy sample by the shape measured from the distribution of the galaxies themselves. However, we expect low-mass haloes to host only a few satellite galaxies, leading to very unreliable estimates of the halo shape (see Figure 4.4). In order to separate the signal we are looking for – i.e. the shape-dependence of the galaxy bias – from the unwanted bias introduced by using too few galaxies in the shape measurement, we now consider bgal as a function of the number of satellites per halo, Nsat, instead of the equivalent peak height ν. An additional advantage of this approach is that Nsat is directly observable. We note, however, that we obtain almost identical results when considering galaxy bias as a function of peak height instead of Nsat.

Since we do not use the dark matter particle data in this case, we are no longer constrained by needing haloes with at least 700 particles. However, as we can now only consider haloes with at least one satellite galaxy that satisfies both M∗ > 10^9 h^-1 M⊙ and |rsat − rcen| < R200, we are left with a sample of 2 566 441 galaxies. The results are shown in Figure 4.6. At the lowest value of Nsat, no significant effect can be seen.
But as Nsat grows, increasing the accuracy of the shape determinations, we again see a clear dependence of bgal on the shape s: galaxies in more spherical haloes are ∼ 20 per cent more strongly clustered than average. The inverse is true for galaxies in the most aspherical haloes. When Nsat grows too high our results are once more dominated by statistical errors, due to the low number of high-mass haloes (hosting at least several hundreds of satellites) available.

These results show that halo assembly bias in the form of a shape-dependent clustering strength propagates to the clustering of galaxies, and can therefore in principle be measured in sufficiently large surveys. In order to carry out such a task, a large galaxy survey with appropriately defined group catalogues is needed.

Figure 4.6: As Figure 4.5, but now showing the bias as a function of the number of satellite galaxies and split by the halo shape measured from galaxies within R200. At low Nsat no significant shape-dependence of bgal is recovered, due to the extremely unreliable shape determinations that follow from using only a handful of galaxies to sample the halo. For 10 ≲ Nsat ≲ 400, however, we find again that galaxies in more spherical (aspherical) haloes have a significantly higher (lower) bias than average. At higher Nsat our results are again dominated by poor statistics.

4.4 Summary

We have investigated the effects of halo alignment with larger scale structure and of halo ellipticity on galaxy correlation functions, using the Millennium Simulations (Springel, 2005a; Boylan-Kolchin et al., 2009) and the galaxy formation models of Guo et al. (2011).
By rotating satellite galaxies in FoF groups around their central galaxies, either coherently for each halo or independently for each satellite, and then comparing the correlation function of the resulting galaxy distribution to the original one, we were able to quantify the importance of taking halo alignment and non-sphericity into account. Furthermore, by measuring the shape of the haloes as traced by the galaxies we were able to investigate the propagation of shape-dependent assembly bias to the clustering of galaxies. Only galaxies with stellar masses M∗ > 10^9 h^-1 M⊙ were considered in our analysis, though we note that increasing this mass limit by a factor of ten does not influence our results. Our findings can be summarized as follows:

• The effects on the galaxy correlation function of the alignment of haloes with larger-scale structure are small. The main effect of disrupting this alignment is a 2 per cent reduction in correlation amplitude around r ≈ 1.8 h^-1 Mpc, with minor effects of at most 1 per cent at smaller scales.

• The ellipsoidal shapes of the galaxy distributions within individual haloes have a much stronger influence on galaxy correlations. By sphericalizing these galaxy distributions (i.e. randomizing the angular positions of satellites while keeping the distance from the central galaxy fixed), the correlation function is raised by up to 2 per cent around r ≈ 3.5 h^-1 Mpc, but greatly reduced for r ≲ 1.5 h^-1 Mpc, by up to ∼ 20 per cent on the smallest scale probed, r = 30 h^-1 kpc. This confirms the results of Zu et al. (2008). The effect on the galaxy power spectrum extends to scales as large as k = 0.1 h Mpc^-1.

• The assembly bias of haloes, as characterized by the dependence of clustering on halo shape, is reflected in the clustering of galaxies.
The effect is strongest at low equivalent peak heights: at ν ≈ 0.7, the bias of galaxies in the 20 per cent most spherical and most aspherical haloes deviates from the average by 40 per cent.

• Even if the shape of the halo cannot be measured directly, but is instead estimated from galaxies within one virial radius of the central galaxy, the effect of assembly bias is clearly visible.

By using the plane-parallel approximation and ignoring evolution, we have checked that comparable results are obtained for the effects of alignment and ellipticity on the redshift-space correlation functions.

Models that assume a spherically symmetric profile for the galaxy distribution, such as HOD models, will therefore significantly underestimate galaxy correlations and power spectra on sub-Mpc scales. Furthermore, we have demonstrated that the shapes of haloes and of the galaxy distributions within them can be strongly correlated with their clustering. This effect should be measurable in galaxy redshift surveys. With the help of extensive group catalogues it should therefore be possible to measure assembly bias directly.

Acknowledgments

The authors thank Andreas Faltenbacher for kindly providing dark matter shape determinations for haloes in the Millennium Simulation. The Millennium Simulation databases used in this chapter and the web application providing online access to them were constructed as part of the activities of the German Astrophysical Virtual Observatory. This work was supported by the Marie Curie Initial Training Network CosmoComp (PITN-GA-2009-238356) and by Advanced Grant 246797 "GALFORMOD" from the European Research Council.

5 The contributions of matter inside and outside of haloes to the matter power spectrum

Halo-based models have been very successful in predicting the clustering of matter.
However, the validity of the postulate that the clustering is fully determined by matter in haloes remains largely untested, and it is not clear a priori whether non-virialised matter might contribute significantly to the non-linear clustering signal. Here, we investigate the contribution of haloes to the matter power spectrum as a function of both scale and halo mass by combining a set of cosmological N-body simulations to calculate the contributions of different spherical overdensity regions, Friends-of-Friends groups and matter outside haloes to the power spectrum. We find that on scales k < 2 h Mpc^-1, matter inside spherical overdensity regions of size R200,mean accounts for less than 85% of the power, regardless of the minimum halo mass. Its relative contribution increases with increasing Fourier scale, peaking at ∼ 95% around k = 20 h Mpc^-1 and on smaller scales remaining roughly constant. For 2 ≲ k ≲ 10 h Mpc^-1, haloes below ∼ 10^11 h^-1 M⊙ provide a negligible contribution to the power spectrum, the dominant contribution on these scales being provided by haloes with masses M200 ≳ 10^13.5 h^-1 M⊙, even though such haloes account for only ∼ 13% of the total mass. When haloes are taken to be regions of size R200,crit, the amount of power unaccounted for is larger on all scales. Accounting also for matter inside FoF groups but outside R200,mean increases the contribution of halo matter on all scales probed here by 5 − 15%. Matter inside FoF groups with MFoF > 10^9 h^-1 M⊙ accounts for essentially all power for 3 < k < 100 h Mpc^-1. We therefore expect a halo model based approach to overestimate the contribution of haloes of any mass to the power on small scales (k ≳ 1 h Mpc^-1), while ignoring the contribution of matter outside R200,mean, unless one takes the halo to be a broader non-spherical region similar to the FoF group.

Marcel P.
van Daalen and Joop Schaye
In preparation

5.1 Introduction

The matter power spectrum, a measure of how matter clusters as a function of scale, is a key observable and a powerful tool in determining the cosmological parameters of our Universe. As future weak lensing experiments which will measure this quantity to unprecedented accuracy, such as DES[1], LSST[2], Euclid[3] and WFIRST[4], draw ever closer, the precision with which the theoretical matter power spectrum is being predicted steadily increases as well. Currently, some of the largest uncertainties on fully non-linear scales come from our incomplete understanding of galaxy formation (e.g. van Daalen et al., 2011), which causes large unwanted biases in the cosmological parameters derived from observations. We expect that we may be able to account for these using independent measurements of, for example, the large-scale gas distribution, and/or to marginalise over these uncertainties using a halo model based approach, although for the largest of these future surveys more effective and less model-dependent mitigation strategies than currently exist will be needed (e.g. Semboloni et al., 2011; Zentner et al., 2013).

But even assuming that we can somehow account for the effects of galaxy formation on the distribution of matter, significant challenges remain before we are able to predict the matter power spectrum to the sub-percent accuracy needed to fully exploit future measurements (Huterer & Takada, 2005; Hearin, Zentner & Ma, 2012). These include converging on the “true” simulation parameters in N-body codes, although these too can be marginalised over (Smith et al., 2014). However, with each such marginalisation one should expect the constraining power of observations to be reduced.

Direct simulations are not the only way to obtain theoretical predictions of the matter power spectrum, however. Other avenues, such as through the analytical halo model (e.g.
Seljak 2000, Ma & Fry 2000; see Cooray & Sheth 2002 for a review), exist, and are widely used in clustering studies. The halo model is based on the assumption that all matter is partitioned over dark matter haloes, which finds its origin in the model proposed by Press & Schechter (1974, hereafter PS), later extended by Bond et al. (1991). The PS formalism is based on the ansatz that the fraction of mass in haloes of mass M(R) is related to the fraction of the volume that contains matter fluctuations δR > δcrit, where R is the smoothing scale and δcrit is the critical overdensity assuming spherical collapse. If the initial field of matter fluctuations is known, a halo mass function can be derived from this ansatz, which together with a model for the bias b(M) (the clustering strength of a halo of mass M relative to the clustering of matter) and a description of halo density profiles fully determines the clustering of matter.

[1] http://www.darkenergysurvey.org/
[2] http://www.lsst.org/lsst
[3] http://www.euclid-imaging.net/
[4] http://wfirst.gsfc.nasa.gov/

Much work has been done to improve the predictions of the halo model since its introduction. More accurate mass functions have been derived based on, for example, ellipsoidal collapse (Sheth, Mo & Tormen, 2001), fits to N-body simulations (e.g. Jenkins et al. 2001; Warren et al. 2006; Reed et al. 2007; Tinker et al. 2008; Bhattacharya et al. 2011; Angulo et al. 2012; Watson et al. 2013; see Murray, Power & Robotham 2013 for a comparison of different models) and simulations taking into account the effects of baryons (e.g. Stanek, Rudd & Evrard, 2009; Sawala et al., 2013; Martizzi et al., 2014; Cusworth et al., 2014; Khandai et al., 2014; Cui, Borgani & Murante, 2014; Velliscig et al., 2014). Similarly, much effort has gone into deriving more accurate (scale-dependent) bias functions (e.g.
Sheth & Tormen, 1999; Seljak & Warren, 2004; Smith, Scoccimarro & Sheth, 2007; Reed et al., 2009; Grossi et al., 2009; Manera, Sheth & Scoccimarro, 2010; Pillepich, Porciani & Hahn, 2010; Tinker et al., 2010) and concentration-mass relations for halo profiles (e.g. Bullock et al., 2001; Eke, Navarro & Steinmetz, 2001; Neto et al., 2007; Duffy et al., 2008; Macciò, Dutton & van den Bosch, 2008; Prada et al., 2012; Ludlow et al., 2014). Current halo models may incorporate additional ingredients like triaxiality, substructure, halo exclusion, primordial non-Gaussianity and baryonic effects (e.g. Sheth & Jain, 2003; Smith & Watts, 2005; Giocoli et al., 2010; Smith, Desjacques & Marian, 2011; Gil-Marín, Jimenez & Verde, 2011; Fedeli, 2014), and fitting formulae based on the halo model have also been developed (e.g. Smith et al., 2003b; Takahashi et al., 2012).

However, the validity of the postulate that the clustering of matter is fully determined by matter in haloes remains relatively untested. Even though matter is known to occupy non-virialised regions such as filaments, this matter may itself simply be made up of very small haloes, although recent results indicate that part of the dark matter accreted onto haloes is genuinely smooth (Angulo & White, 2010b; Fakhouri & Ma, 2010; Genel et al., 2010; Wang et al., 2011). Either way, it is not clear a priori whether this non-virialised matter contributes significantly to the non-linear clustering signal.

Here, we examine the contributions of halo and non-halo mass to the matter power spectrum with the use of a set of N-body simulations. We will first investigate the contribution to the redshift zero matter power spectrum of haloes that are defined analogously to the typical halo model approach, also examining the contributions of matter in smaller overdensity regions and outside of haloes. Next, we expand the haloes to include all matter associated with Friends-of-Friends (FoF) groups.
Finally, we make predictions for the contribution of halo matter to the power spectrum as a function of both scale and minimum halo mass, which can serve as a test for halo models aimed at reproducing the clustering of dark matter.

This chapter is organized as follows. In §5.2 we describe our simulations and the employed power spectrum estimator. We present and discuss our results in §5.3 and summarise our findings in §5.4.

Name        Box size [h^-1 Mpc]   Particle number   mdm [h^-1 M⊙]   εmax [h^-1 kpc]
L400N1024   400                   1024^3            4.50 × 10^9     4.00
L200N1024   200                   1024^3            5.62 × 10^8     2.00
L050N512    50                    512^3             7.03 × 10^7     1.00
L025N512    25                    512^3             8.79 × 10^6     0.50

Table 5.1: The different simulations employed in this chapter. From left to right, the columns list their name, box size, particle number, particle mass and maximum proper softening length. All simulations were run with only dark matter particles and a WMAP7 cosmology.

5.2 Method

5.2.1 Simulations

We base our analysis on a set of dark matter only simulations that were run with a modified version of gadget iii, the smoothed-particle hydrodynamics (SPH) code last described in Springel (2005b). The cosmological parameters are derived from the Wilkinson Microwave Anisotropy Probe (WMAP) 7-year results (Komatsu et al., 2011), and given by {Ωm, Ωb, ΩΛ, σ8, ns, h} = {0.272, 0.0455, 0.728, 0.81, 0.967, 0.704}.

We generate initial conditions assuming the Eisenstein & Hu (1998) transfer function. Prior to imposing the linear input spectrum, the particles are set up in an initially glass-like state, as described in White (1994). The particles are then evolved to redshift z = 127 using the Zel’dovich (1970) approximation.

The relevant parameters of the simulations we employ here are listed in Table 5.1. The simulation volumes range from 25 h^-1 Mpc to 400 h^-1 Mpc.
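As a quick consistency check on the particle masses listed in Table 5.1: in a dark matter only run all matter is carried by the particles, so each particle should have a mass of Ωm ρcrit (L/N)^3, with N the number of particles per dimension. The sketch below evaluates this expression. The value adopted for ρcrit (2.775 × 10^11 h^2 M⊙ Mpc^-3) and the function and variable names are assumptions made for illustration; they are not quoted in the text.

```python
# Present-day critical density in h^2 Msun / Mpc^3 (assumed standard value,
# not quoted in the chapter).
RHO_CRIT = 2.775e11
# Matter density parameter of the WMAP7 cosmology used in the chapter.
OMEGA_M = 0.272


def particle_mass(box_size, n_per_dim, omega_m=OMEGA_M):
    """Particle mass in h^-1 Msun for a dark matter only box.

    box_size  : comoving side length in h^-1 Mpc
    n_per_dim : number of particles per dimension (e.g. 1024 for 1024^3)
    """
    return omega_m * RHO_CRIT * (box_size / n_per_dim) ** 3


# (box size, particles per dimension) for the runs of Table 5.1.
runs = {
    "L400N1024": (400.0, 1024),
    "L200N1024": (200.0, 1024),
    "L050N512": (50.0, 512),
    "L025N512": (25.0, 512),
}

for name, (box, n) in runs.items():
    print(f"{name}: m_dm = {particle_mass(box, n):.2e} h^-1 Msun")
```

Under these assumptions the four masses agree with the tabulated values to better than one per cent, and halving the mean inter-particle spacing L/N lowers the particle mass by a factor of 8.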
The mass resolution improves by a factor of 8 with each step, corresponding to an improvement of the spatial resolution by a factor of 2, from the largest down to the smallest volume. The gravitational forces are softened on a comoving scale of 1/25 of the initial mean inter-particle spacing, L/N, but the softening length is limited to a maximum physical scale of 2 h^-1 kpc [L/(100 h^-1 Mpc)] which is reached at z = 2.91. As we will demonstrate, by combining these simulations, we can accurately determine the matter power spectrum from linear scales up to k ∼ 100 h Mpc^-1.

5.2.2 Power spectrum calculation

The matter power spectrum is a measure of the amount of structure that has formed on a given Fourier scale k, related to a physical scale λ through k = 2π/λ. It is defined through the Fourier transform of the density contrast, δ̂k. We will present our results in terms of the dimensionless power spectrum, defined in the usual way:

$$\Delta^2(k) = \frac{k^3}{2\pi^2} P(k) = \frac{k^3 V}{2\pi^2} \left\langle |\hat{\delta}_k|^2 \right\rangle_k , \tag{5.1}$$

with V the volume of the simulation under consideration. As all particles have the same mass, the shot noise is simply equal to ⟨|δ̂k|^2⟩k,shot = 1/Np, with Np the number of particles in the simulation. All power spectra presented here have had shot noise subtracted to obtain more accurate results on small scales.

We calculate the matter power spectrum using the publicly available f90 package powmes (Colombi et al., 2009). The advantages of powmes stem from the use of the Fourier-Taylor transform, which allows analytical control of the biases introduced, and the use of foldings of the particle distribution, which allow the dynamic range to be extended to arbitrarily high wave numbers while keeping the statistical errors bounded. For a full description of these methods we refer to Colombi et al. (2009). As in van Daalen et al. (2011), we set the grid parameter to G = 256 and use a folding parameter F = 7 for the two smallest volumes.
To calculate the power spectrum down to similar scales for the 200 and 400 h^-1 Mpc boxes, we set F equal to 8 and 9, respectively. Our results are insensitive to this choice of parameters.

Both box size and resolution effects lead to an underestimation of the power – at least on scales where a sufficient number of modes is available so that the effects of mode discreteness can be ignored (k ≳ 8π/L) – while all simulations show excellent agreement on scales where they overlap (see §5.2). In order to cover the dynamic range from k = 0.01 h Mpc^-1 to 100 h Mpc^-1, we therefore combine the power spectra of different simulations by always taking the largest value of Δ^2(k) at each k. In the case of the full power spectrum, i.e. the power spectrum of all matter, we take the combined power spectrum to be the one predicted by linear theory up to k = 0.12 h Mpc^-1, where the power starts to become non-linear. While the largest boxes show excellent agreement with the linear power spectrum on these scales, we wish to avoid box size effects as much as possible. For k > 0.12 h Mpc^-1 – or, in the case of power spectra of subsets, for the smallest k-value available – we individually average each power spectrum over each of 25 bins in Fourier space ki and assign the combined power spectrum the largest Δ^2(ki) of all simulations derived in this way. We combine the power spectra of selections of particles (e.g. all particles that reside in haloes above a certain mass) in a similar way, but without including the linear theory power spectrum.

5.2.3 Halo particle selection

In the halo model approach, haloes are commonly defined through a spherical overdensity criterion, usually relative to the mean density of the Universe. In order to investigate the contribution of such haloes to the matter power spectrum, we define our haloes consistently.
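A spherical overdensity criterion relative to the mean density fixes a one-to-one relation between the mass and the radius of a halo. The following minimal sketch illustrates this for an overdensity of 200 times the mean matter density at z = 0. The value of ρcrit and the function names are assumptions made for illustration and do not come from the text.

```python
import math

# Present-day critical density in h^2 Msun / Mpc^3 (assumed standard value).
RHO_CRIT = 2.775e11
# Matter density parameter of the WMAP7 cosmology used in the chapter.
OMEGA_M = 0.272


def m200_mean(r200, omega_m=OMEGA_M, delta=200.0):
    """Mass [h^-1 Msun] of a sphere of radius r200 [h^-1 Mpc] whose mean
    internal density equals delta times the mean matter density."""
    return delta * (4.0 * math.pi / 3.0) * omega_m * RHO_CRIT * r200 ** 3


def r200_mean(m200, omega_m=OMEGA_M, delta=200.0):
    """Inverse relation: radius [h^-1 Mpc] for a given mass [h^-1 Msun]."""
    return (3.0 * m200 / (4.0 * math.pi * delta * omega_m * RHO_CRIT)) ** (1.0 / 3.0)


# Example: radius of a halo of 10^12 h^-1 Msun under these assumptions.
print(f"R200,mean = {r200_mean(1e12):.3f} h^-1 Mpc")
```

Because the reference density is Ωm ρcrit rather than ρcrit, a region defined relative to the critical density with the same overdensity factor is denser and therefore smaller for a given halo.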
Overdense regions are identified in our simulations with a spherical overdensity finder, as implemented in the subfind algorithm (Springel et al., 2001). We define a halo as a spherical region with an internal mass overdensity of 200 × Ωm ρcrit, where ρcrit is the critical density of the Universe. These haloes therefore have a mass equal to:

$$M_{200} = M_{200,\mathrm{mean}} = 200 \times \frac{4\pi}{3}\, \Omega_\mathrm{m}\, \rho_\mathrm{crit}\, R_{200}^3 , \tag{5.2}$$

where R200 = R200,mean is the radius of the region. In the remainder of the chapter, we will define halo particles as all particles with a distance R < R200 from any halo centre. All other particles are treated as non-halo particles, irrespective of their possible FoF group membership, or having been identified as part of a bound subhalo by subfind. While we focus on halo matter as defined through R200, we will also briefly discuss the contribution of halo matter to the power spectrum for other overdensity regions and halo definitions (i.e. R500, R2500, R200,crit and Friends-of-Friends) during the course of the chapter.

5.3 Results

5.3.1 Fractional mass in haloes

We first examine the fraction of the mass that resides in haloes, fh. As in each simulation there is a lower limit to the masses of haloes that we can reliably resolve, we compute fh as a function of the minimum mass of the included haloes. Knowing the minimum resolved masses also allows us to estimate over which halo mass range we can probe the contribution of halo particles to the power spectrum in each simulation.

The results for fh are shown in Figure 5.1. Different colours are used for each of our four different simulations, as indicated in the legend. Vertical dotted lines denote the masses corresponding to 100 particles. Below these limits the fraction of mass in haloes flattens off, indicating that such low-mass haloes are unresolved.
A thick dashed line shows the result of combining the mass fractions of all four simulations for Mmin > 10^9 h^-1 M⊙, through fh,comb = max(fh,i). The bottom panel shows the ratio of fh of each simulation to this combined fraction. At the massive end, the high-resolution but low-volume L025 and L050 simulations significantly underestimate fh. This is most clearly seen in the bottom panel: for L025 the mass fraction in haloes becomes significantly underestimated for halo masses M200 ≳ 10^11 h^-1 M⊙, while for the L050 box this happens for M200 ≳ 10^12 h^-1 M⊙. This is consistent with the points at which the halo mass functions are underestimated for these simulations (not shown). The fluctuations seen in the bottom panel for L200 for M200 > 10^14 h^-1 M⊙ are due to the rarity of such massive haloes, but as the fraction of the mass residing in such haloes is < 10% this does not impact our conclusions. All simulations in which haloes at a certain Mmin are both well-resolved and well-represented show excellent agreement in fh(M > Mmin).

Figure 5.1: The cumulative fraction of mass inside haloes, fh, as a function of minimum halo mass, for different collisionless simulations as indicated in the legend. The resolution limit, defined as the mass of haloes containing 100 particles, is shown as a vertical dotted line for each simulation. Below this limit, the fraction of mass in haloes is underestimated. For the two highest-resolution simulations these fractions are also significantly underestimated at high masses, as these haloes are under-represented in these small volumes. Between the limits imposed by resolution and box size effects, the simulations are in excellent agreement, and show that the fraction of mass in haloes is ∼ 52% for M200 > 10^9 h^-1 M⊙. A black dashed line shows the combined result, taking the maximum fraction of mass in haloes between the different simulations at every mass, while the bottom panel shows the fraction of this combined function predicted by each simulation. We also show predictions for the Tinker et al. (2008) mass function as a black dotted and dot-dashed line (see main text).

The fraction of mass in haloes increases with decreasing minimum halo mass. Only ∼ 19% of matter is found in groups and clusters (Mmin > 10^13 h^-1 M⊙), which increases to ∼ 30% for Milky Way haloes and up (Mmin > 10^12 h^-1 M⊙). But even at the lowest resolved mass of roughly 10^9 h^-1 M⊙, the fraction of mass in haloes is still barely more than 50%. We therefore expect a significant contribution from particles in haloes with M < 10^9 h^-1 M⊙, and possibly from dark matter particles that do not reside in haloes of any mass, to the matter power spectrum on large scales, which we calculate in the next section.

For comparison, the top panel of Figure 5.1 also shows predictions of the fraction of mass in haloes above a certain mass based on the Tinker et al. (2008) M200 halo mass function. Using the normalized halo mass function fit provided by these authors, we have calculated fh(M > Mmin) under the standard halo model assumption that all mass resides in haloes. The results are shown by the black dotted line. Under this assumption, far more mass is predicted to reside within resolved haloes than we find for our simulations, at any mass. However, the Tinker et al. (2008) mass function converges to a mass density of only about 0.72 × Ωm ρcrit, meaning that either the true mass function predicts far more mass in haloes M200 ≲ 10^11 h^-1 M⊙ (roughly the lowest-mass haloes considered by Tinker et al. 2008), or that about 28% of the dark matter mass is genuinely smoothly distributed at z = 0, not residing in haloes of any mass. To compensate for this “missing mass”, we have also calculated the mass in haloes above some Mmin predicted by the Tinker et al. (2008) mass function relative to the total mass in the Universe.
This result is shown by the dot-dashed line, and shows much better agreement with our simulations. Up to M_min ≈ 10^12 h^-1 M⊙ the relative difference between the Tinker et al. (2008) prediction and our combined result is constant at about 10% before decreasing at higher masses. One possible reason for this discrepancy is that we only count matter in regions where haloes overlap once, which is not taken into account when integrating the mass function. However, we have checked that the mass residing in overlap regions in our simulation is always ≲ 1.7%, with the largest overlap fraction being found for the most massive haloes. The ≲ 10% differences found for f_h are therefore likely due to the non-universality of the halo mass function at this level of precision (e.g. Tinker et al., 2008; Murray, Power & Robotham, 2013).

As it seems that f_h(M > M_min) continues to rise on mass scales unresolved by our simulations, the total contribution of matter in haloes to the power spectrum will be underestimated in our simulations. However, as we will see in §5.3.2.1, this depends on the scale considered. A range in Fourier space exists where the fraction of power from halo particles is bounded below unity, and the contribution of haloes with masses M_200 ≲ 10^11 h^-1 M⊙ is negligible. Additionally, on scales where this does not hold we can still constrain the contribution from haloes above a certain mass. In the remainder of the chapter, we will only consider particles residing in haloes with M_200 > 10^9 h^-1 M⊙ to be halo particles, as this corresponds roughly to the smallest haloes we can resolve.

5.3.2 Halo contribution to the power spectrum

We first show the full dimensionless matter power spectrum, i.e. using all particles, in Figure 5.2. Here each simulation is shown by a different colour, and it is immediately clear that no single one is converged over the full dynamic range up to k ∼ 100 h Mpc^-1.
The linear theory power spectrum, as generated by the Fortran 90 package CAMB (Lewis, Challinor & Lasenby, 2000; version January 2010), is shown as a long-dashed purple line. L400 and L200 show good agreement with the linear power spectrum on scales where non-linear evolution is negligible (k ≲ 0.12 h Mpc^-1) and a sufficient number of modes is available (k > 0.04 and 0.08 h Mpc^-1 respectively, roughly corresponding to λ = 0.4 L), while L050 and L025 show severe box size effects due to their lack of large-scale modes. These box size effects become negligible only for k > 10 and k > 40 h Mpc^-1 respectively.

Figure 5.2: The dimensionless power spectrum derived from each simulation, along with the linear power spectrum (long-dashed purple line) and the combined power spectrum (dashed black line). While the L025 and L050 simulations significantly underestimate the power on large scales due to missing modes, their high resolution allows us to accurately extend the power spectrum of the larger volumes up to k ∼ 100 h Mpc^-1. The erratic behaviour seen for low-resolution simulations at large k is due to shot noise subtraction. The bottom panel shows for each simulation (as well as for the linear theory prediction) the fraction of power relative to the combined power spectrum. For k < 20 h Mpc^-1, multiple simulations show the same results, indicating convergence on these scales.

Due to their finite resolution, all simulations underestimate the power on sufficiently small scales. Note that all power spectra shown here have had shot noise subtracted, which explains the erratic behaviour of the power spectra on the smallest scales. The underestimation of small-scale power becomes significant already on scales corresponding to ∼ 100 softening lengths. However, for every k ≲ 100 h Mpc^-1, there is at least one simulation for which neither box size nor resolution leads to an underestimation of the power at the ≳ 1% level.
We therefore combine the different power spectra as described in §5.2.2 to obtain the combined power spectrum, shown as a dashed black line. The bottom panel of Figure 5.2 shows the fraction of power predicted by each simulation, as well as the fraction predicted by linear theory, relative to the combined power spectrum. By construction, this fraction is bounded to unity on non-linear scales. Note that on scales k ≲ 20 h Mpc^-1, the fractions of multiple simulations are within a few percent of unity, indicating convergence on these scales. For smaller scales, however, convergence is uncertain, although based on the results for larger scales we expect our combined power spectrum to be accurate to ∼ 1% up to k ∼ 100 h Mpc^-1.

Figure 5.3: The combined power spectrum for different sets of particles: R < R_200 (halo particles), R < R_500, R < R_2500 and R > R_200 (non-halo particles). Only haloes with M > M_min = 10^9 h^-1 M⊙ were considered in the cuts made, which in total contain about 38% of all dark matter. The halo particles easily dominate the power on small scales; however, there is a significant range of non-linear scales (k < 1.4 h Mpc^-1) where this subset does not provide the most power. While the non-halo particles account for a larger fraction of the power for k < 0.6 h Mpc^-1, it is the cross-terms between the halo and non-halo particles (not shown) which dominate in the mildly non-linear regime. Note that the horizontal range has been shortened relative to Figure 5.2.

Next, we repeat this procedure for halo and non-halo particles. We also consider particles within the R_500 and R_2500 overdensity regions, defined analogously to R_200, which probe the inner parts of haloes. As we cannot reliably resolve haloes with fewer than about 100 particles with our highest-resolution simulation, we only consider the contribution of haloes with masses M > M_min = 10^9 h^-1 M⊙ here, treating matter in lower-mass haloes as non-halo particles.
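To make the split into halo, non-halo and cross-term contributions concrete, the following is a minimal sketch on a toy particle set. It is not the pipeline used for the simulations in this chapter: the function names (`ngp_counts`, `mode_power`, `decompose_power`), the nearest-grid-point mass assignment, the random toy data and the absence of shot noise subtraction and k-binning are all illustrative assumptions. It also assumes that each subset's density contrast is normalised by the total mean density, so that the contrasts of the two subsets add up to the total contrast.

```python
import numpy as np

def ngp_counts(pos, ngrid, boxsize):
    """Nearest-grid-point particle counts on an ngrid^3 mesh (equal-mass particles)."""
    idx = np.floor(pos / boxsize * ngrid).astype(int) % ngrid
    counts = np.zeros((ngrid, ngrid, ngrid))
    np.add.at(counts, (idx[:, 0], idx[:, 1], idx[:, 2]), 1.0)
    return counts

def mode_power(delta_a, delta_b):
    """Per-mode (cross) power Re[a(k) b*(k)]; unbinned, no shot noise subtraction."""
    fa = np.fft.rfftn(delta_a) / delta_a.size
    fb = np.fft.rfftn(delta_b) / delta_b.size
    return (fa * np.conj(fb)).real

def decompose_power(pos, in_halo, ngrid, boxsize):
    """Split the total power into halo, non-halo and cross-term parts."""
    counts_all = ngp_counts(pos, ngrid, boxsize)
    counts_h = ngp_counts(pos[in_halo], ngrid, boxsize)
    counts_n = counts_all - counts_h
    mean_all = counts_all.mean()
    # Assumption: every subset is normalised by the TOTAL mean density,
    # so that delta_all = delta_h + delta_n holds exactly.
    delta_h = (counts_h - counts_h.mean()) / mean_all
    delta_n = (counts_n - counts_n.mean()) / mean_all
    delta_all = delta_h + delta_n
    return {
        "all": mode_power(delta_all, delta_all),
        "halo": mode_power(delta_h, delta_h),
        "nonhalo": mode_power(delta_n, delta_n),
        "cross": 2.0 * mode_power(delta_h, delta_n),
    }

# Toy data: uniformly random particles with a random (hypothetical) halo flag.
rng = np.random.default_rng(0)
pos = rng.uniform(0.0, 100.0, size=(5000, 3))
in_halo = rng.random(5000) < 0.5
parts = decompose_power(pos, in_halo, ngrid=16, boxsize=100.0)
```

With this normalisation the identity all = halo + nonhalo + cross holds mode by mode, which is why the halo and non-halo auto-spectra alone need not add up to the total power.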
The resulting power spectra are shown in Figure 5.3. Note that for clarity only the combined power spectra are shown, and that the horizontal range has been shortened with respect to Figure 5.2, only showing the range of scales for which we can reliably determine the power spectrum.

The contribution from halo particles strongly dominates the power on small scales. The halo contribution is in turn dominated by the very inner regions of haloes, at least on scales smaller than the size of these regions. However, towards larger scales this contribution diminishes, and for k < 0.4 h Mpc^-1 less than half of the total power is provided by matter in haloes alone. On large scales the significant fraction of the mass that occupies non-virialised regions becomes more important, increasing to about 20%, roughly half of the contribution of halo matter on the same scales. The remaining ∼ 40% of the total matter power on large scales is therefore contributed by the cross-terms of halo and non-halo matter (not shown here).

Note that on the scales shown here, only L400 and L200 contribute to the combined power spectrum of non-halo particles. However, as these two are in excellent agreement for k ≳ 0.4 h Mpc^-1 even though the mass resolution is eight times worse in L400, we do not have reason to believe that this component would significantly change on non-linear scales if lower-mass haloes were resolved. On linear scales, the contribution of halo matter is mostly determined by the fraction of mass in haloes, which does of course depend on the minimum halo mass resolved. We will return to this point in §5.3.2.1.

We investigate the contribution of halo matter in more detail in Figure 5.4, which shows the ratio of the power spectrum of matter within R_200 of haloes with masses M_200 > 10^9 h^-1 M⊙ to the power spectrum of all matter.
The black dashed line shows the ratio of the combined power spectra, obtained from the smoothed power spectra of all four simulations shown here as described in §5.2.2, relative to the combined total power spectrum (black line in Figure 5.3). The solid lines show the relative contribution of halo matter in each simulation separately.

The contribution of halo matter to the total power increases with decreasing physical scale. On large (linear) scales, the contribution from haloes seems to converge to ∼ 30%, in good agreement with f_h(M > 10^9 h^-1 M⊙)^2 ≈ 0.27. This is expected, as the contribution of any subset of matter to the power spectrum on linear scales should scale only with (the square of) the fraction of mass contained in such a subset. However, as the fraction of power in haloes on large scales is fully determined by L200 and L400, with both predicting roughly the same fraction as can be seen in Figure 5.4, while the fraction of mass in haloes with M > 10^9 h^-1 M⊙ is only accurately measured for L025, this correspondence is actually surprising.

On non-linear scales the ratio rapidly increases down to physical scales of λ ∼ 2 h^-1 Mpc (k ∼ 3 h Mpc^-1), reaching at most 95%, before slowly levelling off towards smaller scales. Note that the combined results are fully determined by L050 around k ≈ 20 h Mpc^-1, where we are unable to show convergence due to the too-low resolution of L200 and too-small volume of L025. However, the results of §5.3.2.1 imply that little would change on these scales if higher-resolution simulations were available.

Figure 5.4: The fraction of power within haloes with masses M_200 > 10^9 h^-1 M⊙ as a function of scale. A dashed black line again shows the combined power spectrum derived from the smoothed power spectra of the four simulations employed in this chapter, each of which is shown as well. The halo contribution rapidly rises down to λ ∼ 2 h^-1 Mpc, peaking at ∼ 95% for k ≈ 20 h Mpc^-1 (λ ≈ 300 h^-1 kpc) and remaining roughly constant for larger k. The power spectrum on smaller scales is dominated by increasingly smaller haloes, while the power spectrum on the largest scales depends mainly on the total mass fraction. The grey dashed line shows the result if R_200,crit is used instead of R_200.

While L400 and L200 are in good agreement for 0.2 ≲ k ≲ 10 h Mpc^-1, on sub-Mpc scales the contribution of halo matter to the total matter power spectrum starts to show a strong dependence on resolution. On these scales fluctuations within the same halo dominate the power spectrum (i.e. the 1-halo term in halo model terminology), so naturally the contribution to the power will be underestimated on scales λ ≲ R_200,min, where R_200,min is the virial radius of a halo with the minimum resolved mass, M_min, in that particular simulation. In practice, the power is significantly underestimated already on larger scales, due to the gravitational softening employed in the simulation, which leads to an underestimation of the inner density of haloes. This leads to a significant power deficit on scales λ ≲ 100 ε_max (see Table 5.1). Fortunately, the combination of simulations chosen here still allows us to probe the contribution of halo matter up to k_max ∼ 100 h Mpc^-1.

As the power on sub-Mpc scales is dominated by the 1-halo term, adding lower-mass haloes than those resolved here will have a negligible impact on the measured contribution of halo matter on the scales considered here, as 2π/R_200,min > k_max. Therefore, 5–7% of small-scale power is unaccounted for by halo particles, regardless of resolution effects. Instead, it is the cross-term between halo matter and matter just outside the R_200 regions that makes up the deficit. This implies that the R_200 overdensity regions do not fully capture the halo.

Figure 5.5: As Figure 5.4, but now for all mass inside FoF groups with M_FoF > 10^9 h^-1 M⊙. While the scale dependence is very similar (i.e. a rapid rise down to λ ∼ 1 h^-1 Mpc and roughly constant on smaller scales), the contribution to the power spectrum is higher than for the R_200 overdensity regions on any scale. The contribution of halo matter to the power spectrum is increased by 5–10% on all scales relative to the results of Figure 5.4, and matter in FoF groups accounts for essentially all power on scales k > 3 h Mpc^-1.

To demonstrate that this is indeed the case, we calculate the contribution of matter in FoF groups (with a linking length of 0.2) to the total power spectrum, with a mass limit of M_FoF > 10^9 h^-1 M⊙. The results are shown in Figure 5.5. Here we see that, while the scale dependence of the contribution is similar to that shown in Figure 5.4 for the R_200 regions, the contribution is significantly larger on all scales, and is essentially 100% for k ≳ 3 h Mpc^-1. This Fourier scale corresponds well to the virial radius of the largest clusters in the simulation. On scales k ≲ 0.3 h Mpc^-1, the contribution of matter in FoF groups is consistently ∼ 25% higher than that of matter in R_200 haloes. Correspondingly, the fraction of mass in FoF groups is also higher than the fraction of mass in R_200 haloes at every mass, from 10% higher at M = 10^9 h^-1 M⊙ to 15% higher at M = 10^14 h^-1 M⊙ (not shown).

Finally, we also show the results if R_200,crit is used instead (with M_200,crit > 10^9 h^-1 M⊙), as a dashed grey line in Figure 5.4. As such an overdensity criterion picks out smaller regions than R_200, containing less mass, the contribution of halo matter to the power spectrum is also smaller, especially on large scales. On sub-Mpc scales, however, the differences are small, with the contribution to the power spectrum of halo matter peaking at 94%.
We conclude that the choice of region used to represent a halo has a large impact on the contribution of haloes to the matter power spectrum, in a scale-dependent way. In what follows, we will continue to define haloes using the mean overdensity criterion, as this is typically used in the halo model approach.

5.3.2.1 Contribution as a function of mass

To see which haloes contribute most to the matter power spectrum as a function of scale, while simultaneously examining the dependence of our results on the mass of the lowest resolved halo, we turn to Figure 5.6. Each panel corresponds to a different simulation and each curve to a different minimum halo mass. The halo contributions are shown relative to the combined power spectrum of all matter (black line in Figure 5.3). The legend shows the minimum halo mass log10(M_min/[h^-1 M⊙]). Note that the minimum masses differ for each simulation, because these are based on the particle masses of the simulations, in such a way that the first fractional contribution to the power shown includes matter in haloes of 100 particles or more, corresponding to our imposed resolution limit in mass (see Figure 5.1). The minimum halo mass increases by half a dex with each step. Grey regions indicate the approximate scales on which the full matter power spectrum of the simulation is not converged to ∼ 1% with respect to the combined one. While this gives an indication of which scales to trust, note that the relative contribution of each halo mass may be converged for a different range of scales. Finally, the bottom half of each panel shows the difference between consecutive lines, i.e. the contribution added by decreasing the minimum halo mass by half a dex. Here f_Δ,i ≡ Δ^2_200,i / Δ^2_all.
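The differencing of consecutive curves shown in the bottom halves of the panels can be sketched as follows. This is only an illustration: the function name `differential_contributions` and the numbers in the example are made up, and the differences include any cross-terms between the added mass bin and the more massive haloes, exactly as the difference between two cumulative curves would.

```python
import numpy as np

def differential_contributions(log_mmin, f_cum):
    """Contribution added per mass step, from cumulative power fractions.

    f_cum[i, j] is the fraction of the total power at wavenumber k_j that is
    provided by haloes with M > 10**log_mmin[i]. The returned array holds,
    for each threshold, the difference with the next-higher threshold; the
    last row is the open-ended highest-mass bin.
    """
    log_mmin = np.asarray(log_mmin, dtype=float)
    f_cum = np.asarray(f_cum, dtype=float)
    order = np.argsort(log_mmin)          # sort thresholds from low to high mass
    f_sorted = f_cum[order]
    diff = np.empty_like(f_sorted)
    diff[:-1] = f_sorted[:-1] - f_sorted[1:]
    diff[-1] = f_sorted[-1]
    return log_mmin[order], diff

# Hypothetical numbers for three thresholds (half a dex apart) and two k values.
log_mmin = [11.0, 11.5, 12.0]
f_cum = [[0.30, 0.90],
         [0.28, 0.89],
         [0.25, 0.85]]
thresholds, f_diff = differential_contributions(log_mmin, f_cum)
```

By construction the rows of `f_diff` sum back to the cumulative fraction at the lowest threshold, so a bin whose row is close to zero at some k adds little when the minimum mass is lowered past it.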
As we will show shortly, while the relative contributions of haloes of a certain mass shown in the bottom halves of the panels can be compared between different simulations, the same does not hold for the absolute contributions, as box size effects play an important role on a large range of scales.

Several things can be learned from this figure. First, we can ask whether we are resolved with respect to the minimum mass, and on which scales. Low-mass haloes become increasingly important towards larger values of k on sub-Mpc scales. As we discussed before, we therefore do not expect to be converged on scales k ≳ 2π/R_200,min, limiting us to k ≲ 100 h Mpc^-1 when all simulations are combined. Interestingly, on scales k ∼ 10 h Mpc^-1, roughly where the contribution from halo matter plateaus, the L200 simulation (top right panel) is just about converged with minimum halo mass. This can be seen most clearly by comparing the relative halo contribution on this scale at approximately fixed minimum halo mass between different simulations: as for example the L050 simulation (bottom left panel) shows, well-resolved haloes below ∼ 10^11 h^-1 M⊙ (about a factor of two above the halo resolution limit of L200) provide a negligible contribution (< 1%) at k ≈ 10 h Mpc^-1, indicating convergence on this scale. Indeed, both simulations agree that most of the power at k ∼ 10 h Mpc^-1 comes from haloes with masses M_200 ≳ 10^13 h^-1 M⊙. These group and cluster-scale haloes remain the dominant contributors on somewhat larger scales as well, their contribution peaking around k = 2–3 h Mpc^-1 before gradually falling off. Note that haloes with M_200 > 10^13 h^-1 M⊙ account for only about 19% of the total mass (see Figure 5.1).

Figure 5.6: Comparison of the contribution of haloes above a certain mass to the matter power spectrum relative to the total combined power spectrum of all simulations, for L400 (top left), L200 (top right), L050 (bottom left) and L025 (bottom right). The legend shows the minimum halo mass log10(M_min/[h^-1 M⊙]). Note that lines of the same colour do not correspond to the same minimum halo mass in the four panels, as the binning is based on the minimum resolved halo mass (see text). The grey regions denote where box size or resolution effects are ≳ 1% for the full power spectrum; the relative contribution may be converged on a different range, as can be seen by comparing the panels. The bottom half of each panel shows the difference between consecutive lines, i.e. the contribution added by decreasing the minimum halo mass by half a dex. Note that scales exist where we are converged with minimum halo mass: for example, around k = 4 h Mpc^-1 haloes with M_200 < 10^11 h^-1 M⊙ contribute negligibly to the total matter power spectrum. For 1 < k < 10 h Mpc^-1, the power spectrum is dominated by the contributions of haloes with M_200 > 10^13 h^-1 M⊙, even though these account for only about 19% of the total mass. On large scales, haloes in the full mass range probed here contribute significantly to the power, and no convergence is obtained. Note that box size effects strongly influence the contributions of large haloes measured for the smallest two boxes, especially in the case of L025.

On larger scales, convergence is obtained for increasingly larger minimum halo masses: as the bottom halves of the panels show, around k = 4 h Mpc^-1 haloes with masses M_200 < 10^11 h^-1 M⊙ do not make a significant contribution to the power spectrum. While the simulations shown here do not predict exactly the same contribution to the power spectrum for haloes with M_200 > 10^11 h^-1 M⊙, due to the limited box size of L050 and L025, it is clear that the results for the relative importance of such haloes are converged.
However, for k < 1 h Mpc^-1 the contributions of haloes which are not well resolved by L400 and L200 once again become important, and even L050 may no longer be converged with minimum halo mass at the 1% level. This is because on these scales the matter power spectrum is dominated by the cross-correlation of matter in different haloes (i.e. the 2-halo term in halo model terminology). Therefore, while the bottom panels for each simulation do show that the relative contribution of haloes on small enough scales decreases with decreasing M_200, on scales k ≲ 1 h Mpc^-1 our results can only provide a minimum contribution of halo matter to the total matter power spectrum.

As we noted before, the panels show that even when compared at roughly the same minimum halo mass, different simulations make different predictions for the contributions of haloes above a certain mass to the matter power spectrum. When the box size decreases, the contribution on large scales and that of high-mass haloes decreases as well. This is expected, as large-scale modes are missed in the smaller boxes and massive haloes are under-represented. However, the role of low-mass haloes is simultaneously overestimated in the smaller boxes, and the total contribution of halo matter at fixed minimum halo mass is therefore higher than it should be.

To demonstrate explicitly that this is the case, we show in Figure 5.7 again the results for L200 (top-right panel in Figure 5.6), but with the results for L100 superimposed as dashed lines. The L100 simulation has 512^3 particles and therefore the same resolution as L200, but in an 8× smaller volume. Comparing the two simulations therefore shows the effects of box size at fixed resolution. On large scales and for high-mass haloes, the contribution of halo matter is underestimated in L100, relative to L200. Meanwhile, the contribution of low-mass haloes on small scales tends to be overestimated, even though the resolution is identical.
Interestingly, there are mass and spatial scales where the simulations are in perfect agreement, such as for a minimum halo mass of 10^13.75 h^-1 M⊙ for k > 3 h Mpc^-1. Most important, however, is that the contributions at a certain halo mass shown in the bottom half of the panel are in perfect agreement over the entire range of scales, excepting the very highest mass bin (which is under-represented in L100) and the principal modes. This shows that we can still derive the correct contribution of haloes within a certain mass range, and investigate whether we are converged with mass on a certain scale, even when box size effects play a role.

Figure 5.7: As the top-right panel of Figure 5.6, but with the results for L100 added for the same minimum halo masses as dashed lines, showing the effects of box size at fixed resolution. Due to the missing large-scale modes in L100, the large-scale contribution is underestimated. Additionally, high-mass haloes are under-represented and the role of low-mass haloes on small scales is overestimated. However, the relative contributions of haloes of a certain mass shown in the bottom half of the figure are in excellent agreement for all but the highest mass bin.

5.4 Summary & conclusions

In this work we investigated the contribution of haloes to the matter power spectrum as a function of both scale and halo mass. This was motivated by the assumption typically made in halo-based models that all matter resides in spherical haloes of size R_200. To do so, we combined a set of cosmological N-body simulations to calculate the contributions of different spherical overdensity regions, FoF groups and matter outside haloes to the power spectrum. Our findings can be summarised as follows:

• On scales k < 1 h Mpc^-1, haloes (defined as spherical regions with an enclosed overdensity of 200 times the mean matter density in the Universe) with masses M_200 ≲ 10^9.5 h^-1 M⊙, which are not resolved here, may significantly contribute to the matter power spectrum. For 2 < k < 60 h Mpc^-1, our simulations suggest their contribution to be < 1%.
• For k ≳ 2 h Mpc^-1, the minimum mass of haloes that contribute significantly to the matter power spectrum decreases towards smaller scales, with more massive haloes becoming increasingly less important.
• On scales k < 2 h Mpc^-1, matter in haloes accounts for less than 85% of the power in our simulations. Its relative contribution increases with increasing k (i.e. towards smaller physical scales), peaking at ∼ 95% around k = 20 h Mpc^-1. On smaller scales, its contribution is roughly constant. When R_200,crit is used to define haloes instead of the fiducial R_200,mean, the contribution of haloes to the power spectrum decreases significantly on all scales.
• For 2 ≲ k ≲ 10 h Mpc^-1, haloes below ∼ 10^11 h^-1 M⊙ provide a negligible contribution to the power spectrum. The dominant contribution on these scales is provided by haloes with masses M_200 ≳ 10^13.5 h^-1 M⊙, even though such haloes account for only ∼ 13% of the total mass.
• Matter just outside the R_200 overdensity regions, but identified as part of FoF groups, provides an important contribution to the power spectrum. Taken together, matter in FoF groups with M_FoF > 10^9 h^-1 M⊙ accounts for essentially all power for 3 < k < 100 h Mpc^-1. Switching from R_200 to FoF haloes increases the contribution of halo matter on any scale probed here by 5–15%.

As we have demonstrated, the halo model assumption that all matter resides in (spherical overdensity) haloes may have significant consequences for the predictions of the matter power spectrum. Specifically, we expect such an approach to overestimate the contribution of haloes to the power on small scales (k ≳ 1 h Mpc^-1), mainly because it ignores the contribution of matter just outside R_200 to the power spectrum.
While defining haloes to be larger regions similar to FoF groups mitigates the small-scale power deficits, the fact that such regions are often non-virialised and typically non-spherical may lead to other problems. Clearly, the validity of the postulate that the clustering of matter is fully determined by matter in haloes is strongly dependent on the definition of a halo used, but it is hard to say what the “best” definition to use in this context is. For example, while haloes defined through R_200,crit will be more compact and therefore have a smaller overlap fraction than R_200,mean or FoF haloes, their contribution to the power spectrum will be smaller for the same minimum halo mass. And while FoF groups seem to contain nearly all mass important for clustering, the fact that they are not completely virialised, may be non-spherical and have boundaries which do not necessarily correspond to some fixed overdensity (e.g. More et al., 2011) prohibits their use in traditional halo-based models.

Some optimal choice of halo definition may exist, but whatever definition one uses, it remains difficult to say what the contribution to the matter power spectrum of haloes below the resolution limit is. Convergence with mass is extremely slow over a large range of scales, and as the results of §5.3.1 show, apparent convergence over a decade in mass is no guarantee, as successive decades in mass may contribute equally. This makes it extremely difficult to give a definitive answer on whether mass outside haloes significantly contributes to the power spectrum, or even if such mass exists: for example, we cannot exclude the possibility that matter outside R_200 but inside FoF groups itself consists purely of very small R_200 haloes. Therefore, any claims about the role of haloes or the mass contained in them need to be quoted together with a minimum halo mass in order to have a meaningful interpretation.
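Because every statement about the mass in haloes depends on the minimum halo mass, it may help to see how the cumulative fraction f_h(M > M_min) and the combined fraction f_h,comb = max(f_h,i) of §5.3.1 can be computed from halo catalogues. The sketch below is a minimal illustration under simple assumptions: the function names and the toy masses are hypothetical, overlap between haloes is ignored, and the maximum is taken because both limited resolution and limited box size lead to underestimates of f_h.

```python
import numpy as np

def cumulative_halo_mass_fraction(halo_masses, total_mass, mmin_grid):
    """Fraction of the total mass in haloes with M > Mmin, for each Mmin."""
    m = np.sort(np.asarray(halo_masses, dtype=float))
    # Cumulative mass counted from the most massive halo downwards;
    # entry n is the summed mass of the n most massive haloes.
    cum_from_top = np.concatenate(([0.0], np.cumsum(m[::-1])))
    # Number of haloes strictly above each threshold.
    n_above = m.size - np.searchsorted(m, mmin_grid, side="right")
    return cum_from_top[n_above] / total_mass

def combine_fractions(fractions):
    """Combine simulations by taking the maximum fraction at every Mmin."""
    return np.max(np.vstack(fractions), axis=0)

# Toy example in arbitrary mass units (not values from the simulations).
mmin_grid = np.array([0.5, 2.0, 3.5])
f_a = cumulative_halo_mass_fraction([1.0, 2.0, 3.0, 4.0], 20.0, mmin_grid)
f_b = np.array([0.40, 0.36, 0.10])   # hypothetical second simulation
f_comb = combine_fractions([f_a, f_b])
```

Quoting `f_comb` without the corresponding `mmin_grid` would be ambiguous in exactly the sense described above, since the fraction keeps changing as the threshold is lowered.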
Acknowledgements

The authors thank Simon White for useful discussions. The simulations presented here were run on the Cosmology Machine at the Institute for Computational Cosmology in Durham (which is part of the DiRAC Facility jointly funded by STFC, the Large Facilities Capital Fund of BIS, and Durham University) as part of the Virgo Consortium research programme. We also gratefully acknowledge support from the European Research Council under the European Union’s Seventh Framework Programme (FP7/2007-2013) / ERC Grant agreement 278594-GasAroundGalaxies.

6 The galaxy correlation function as a constraint on galaxy formation physics

Semi-analytical models of galaxy formation are generally successful in reproducing the number densities of galaxies as a function of mass. In order to remove possible degeneracies and improve the model, having additional orthogonal constraints like clustering data while exploring parameter space would be useful. However, this is challenging due to the two-point nature of such quantities, which makes using them as a constraint computationally very expensive, as the model would have to be run on the full halo catalogue at every step. Here, we present a fast estimator for the projected galaxy correlation function that produces ∼ 10% accurate results using only a very small subsample of haloes. As a first application, we incorporate it in a recent version of the Munich semi-analytical model and find a set of galaxy formation parameters that simultaneously reproduces the observed z = 0 stellar mass function and clustering data from SDSS.

Marcel P. van Daalen, Bruno M. B. Henriques, Raul E. Angulo and Simon D. M. White
In preparation

Constraining galaxy formation through clustering

6.1 Introduction

Galaxy formation is currently an unsolved problem.
Because of this, any model of galaxy formation (be it hydrodynamical, analytical or semi-analytical in nature) has to rely on some set of observations in order to constrain the parameters of the physical processes that cannot be derived from first principles, or be simulated directly.

Hydrodynamical simulations can simulate baryonic processes directly on large scales while relying on sub-grid recipes to model relevant processes below the resolution limit. As such simulations are relatively expensive computationally, the values of the parameters in the sub-grid formulations usually have to be informed by comparing a set of simulations run at lower resolution or in smaller volumes to some observational quantity, though these numerical settings themselves may impact which parameter values are “right” for it. Still, as the available computational resources are ever growing, the number of processes which cannot be simulated directly is slowly decreasing (e.g. Hopkins et al., 2013), and valiant efforts are currently being made to improve the accuracy of direct cosmological simulations (e.g. EAGLE, Schaye et al., in preparation).

Semi-analytical models (hereafter SAMs), on the other hand, necessarily include more physical parameters to calibrate, as baryonic processes are not simulated directly on any scale. However, once the high-resolution collisionless simulations that they are based on have been run a single time, they can be repeated many times with different parameter values at low computational cost. Coupled with a method to efficiently explore parameter space such as Markov Chain Monte Carlo (MCMC; for a review on this and similar methods see Trotta, 2008), this allows one to find the highest-likelihood set of parameters for any given model, based on a set of observational constraints.

Typically, SAMs use observational data sets of one-point functions, such as stellar mass or luminosity functions, as constraints for their model parameters (e.g. Kauffmann, White & Guiderdoni 1993; Baugh, Cole & Frenk 1996; Somerville & Primack 1998; Kauffmann et al. 1999; Cole et al. 2000; Croton et al. 2006; Bower et al. 2006; Monaco, Fontanot & Taffoni 2007; Somerville et al. 2008; Henriques et al. 2009; Guo et al. 2011; Henriques et al. 2013; see Baugh 2006 for a review on the general methodology). The resulting models of galaxy formation can then be tested against other observables (i.e. observables that are independent of those used as constraints) and be used to make predictions for these. A delicate balance must be maintained here: if the model has too many free parameters, prior regions that are too wide, or if there are too few (independent) observational constraints, degeneracies may occur (i.e. separate regions of high likelihood in parameter space), while too little freedom or failing to include some relevant physical process may leave the model unable to match several observables at once.

SAMs generally have trouble matching the small-scale clustering of galaxies while simultaneously matching other observational constraints such as the luminosity function (e.g. Kauffmann et al. 1999; Springel et al. 2005; Li et al. 2007; Guo et al. 2011; Kang et al. 2012; but see e.g. Kang 2014). In order to determine the cause of this discrepancy, and to test whether the models retain enough freedom to match the observed clustering at all, it would be instructive to use clustering measurements as constraints while exploring parameter space. As galaxy clustering is determined by how galaxies with different properties populate haloes of different mass, it directly constrains galaxy formation, in a way that is complementary to, for example, the luminosity function.
However, this presents a problem: while one-point functions such as the stellar mass function can be quickly estimated with known uncertainty by running the model on only a small sample of representative haloes, allowing large regions of parameter space to be rejected without having to run the model on the full dark matter simulation, the same cannot simply be done for two-point functions such as the correlation function. In principle, any observable that relies on spatial correlations between galaxies can only be calculated by running the model on the full simulation, which is computationally infeasible when thousands of models need to be explored. While running the SAM on a small sub-volume may allow one to measure small-scale correlations to some degree, cosmic variance will be an issue. Additionally, if one aims to compare to observations, where clustering is viewed in projection (unless line-of-sight velocities are used), one still has to account for large-scale correlations, even at small separations. Here, we present a method to quickly estimate the projected correlation function, $w(r_p)$, to within some known uncertainty from a small sample of haloes using a halo-model-based approach, and apply it to constrain the recent version of the Munich semi-analytical model presented in Guo et al. (2013, hereafter G13). By measuring the properties of galaxies within individual haloes and making informed assumptions about the distribution of these haloes, we are able to circumvent the aforementioned problems, greatly reducing the CPU time needed to predict their two-point clustering. This chapter is organised as follows. In Section 6.2, we present our method for estimating $w(r_p)$ and briefly describe the semi-analytical model we apply it to. Next, in Section 6.3, we show the results of using clustering as an additional constraint on parameter space, on top of the often-used $z = 0$ stellar mass function.
Finally, in Section 6.4 we present a summary of our work and discuss future improvements and applications.

6.2 Method

6.2.1 Estimating the correlation function

Our approach differs slightly from that of most previous works that construct a correlation function estimator based on the halo model, where the aim is typically to reproduce observations given some halo occupation distribution (HOD). Here, our goal is instead to reproduce the results of the semi-analytical model run on the full dark matter simulation to within some given accuracy, given the galaxy properties for a small sample of haloes. As we will show, we are able to reproduce the projected correlation function of the full galaxy sample to within about 20%, using the properties of semi-analytical galaxies occupying less than 0.04% of the full halo sample (0.14% of the subhalo sample).

6.2.1.1 The backbone of the model

Our starting point is the linear halo model, introduced independently by Seljak (2000), Ma & Fry (2000) and Peacock & Smith (2000). In what follows, we will adhere to the terminology of Cooray & Sheth (2002). In the analytical halo model the power spectrum, $P(k)$, is written as the sum of two terms:

$$P(k) = P^{1h}(k) + P^{2h}(k). \tag{6.1}$$

Here $P^{1h}(k)$ is the 1-halo term, describing the two-point clustering contribution of points within the same halo, and $P^{2h}(k)$ is the 2-halo term, describing the contribution of points within separate haloes. For the clustering of matter, these are given by:

$$\begin{aligned}
P^{1h}_{\rm dm}(k) &= \int n(M) \left(\frac{M}{\bar\rho}\right)^2 |u(k|M)|^2 \, dM \\
P^{2h}_{\rm dm}(k) &= \iint n(M_1) \left(\frac{M_1}{\bar\rho}\right) u(k|M_1) \, n(M_2) \left(\frac{M_2}{\bar\rho}\right) u(k|M_2) \, P_{hh}(k|M_1, M_2) \, dM_1 \, dM_2.
\end{aligned} \tag{6.2}$$

Here $M = M_{200\rm mean}$ is the halo mass definition¹ we will be using throughout, $n(M)$ is the halo mass function, $\bar\rho$ is the mean matter density of the Universe, $u(k|M)$ is the normalised Fourier transform of the density profile of a halo of mass $M$, and $P_{hh}(k|M_1, M_2)$ is the halo-halo power contributed by two haloes of masses $M_1$ and $M_2$ on a Fourier scale $k$. We can rewrite the latter term assuming a linear scale-independent bias relation, $P_{hh}(k|M_1, M_2) = b(M_1) b(M_2) P_{\rm lin}(k)$, where $b(M)$ is the halo bias and $P_{\rm lin}$ the linear theory matter power spectrum. We then obtain:

$$P^{2h}_{\rm dm}(k) = P_{\rm lin}(k) \left[ \int n(M) b(M) \left(\frac{M}{\bar\rho}\right) u(k|M) \, dM \right]^2. \tag{6.3}$$

From these expressions, one can easily derive a model for the galaxy power spectrum. For this we assume that the number of galaxies scales with the halo mass $M$; specifically, $M \propto \langle N_{\rm gal}|M\rangle$ and $M^2 \propto \langle N_{\rm gal}(N_{\rm gal} - 1)|M\rangle$, leading to:

$$\begin{aligned}
P^{1h}_{\rm gal}(k) &= \int n(M) \frac{\langle N_{\rm gal}(N_{\rm gal} - 1)|M\rangle}{\bar n_{\rm gal}^2} |u_{\rm gal}(k|M)|^p \, dM \\
P^{2h}_{\rm gal}(k) &= P_{\rm lin}(k) \left[ \int n(M) b(M) \frac{\langle N_{\rm gal}|M\rangle}{\bar n_{\rm gal}} u_{\rm gal}(k|M) \, dM \right]^2.
\end{aligned} \tag{6.4}$$

¹ $M_{200\rm mean}$ is the mass within a spherical region with radius $R_{200\rm mean}$ and internal density $200 \times \bar\rho = 200 \times \Omega_m \rho_{\rm crit}$.

Here the mean number density of galaxies is given by $\bar n_{\rm gal} = \int n(M) \langle N_{\rm gal}|M\rangle \, dM$. Note that we have followed Cooray & Sheth (2002) in replacing the normalised Fourier transform of the halo density profile, $u(k|M)$, by one describing the distribution of (satellite) galaxies, $u_{\rm gal}(k|M)$, and subsequently in changing the power-law index on this term in the 1-halo term to $p$. This is often done in the literature in order to be able to differentiate between contributions from central-satellite and satellite-satellite terms, with $p = 1$ for the former and $p = 2$ for the latter, based on the value of $\langle N_{\rm gal}(N_{\rm gal} - 1)\rangle$.
The mean occupation $\langle N_{\rm gal}|M\rangle$, the most common form of the HOD, is often separated into contributions from centrals and satellites as well, with the former ($N_{\rm cen}$) following a roughly lognormal distribution with respect to $M$, and the latter ($N_{\rm sat}$) being very well approximated by a (linear) power law (e.g. Guzik & Seljak, 2002; Kravtsov et al., 2004; Zehavi et al., 2005; Tinker et al., 2005; Zheng et al., 2005). From this, approximate expressions for $\langle N_{\rm gal}(N_{\rm gal} - 1)\rangle$ in terms of $N_{\rm cen}$ and $N_{\rm sat}$ can be derived as well. However, as our aim is to reproduce the results of the semi-analytical model, for which information on the HOD and the galaxy type is much more readily available than for observations, we can explicitly separate the contributions from central and satellite galaxies to the galaxy power spectrum without approximation. Keeping in mind that a halo will contain at most one central, meaning that $\langle N_{\rm cen}(N_{\rm cen} - 1)|M\rangle = 0$, that $\langle N_{\rm cen} N_{\rm sat}|M\rangle = \langle N_{\rm sat} N_{\rm cen}|M\rangle$, and using that central galaxies reside in the centre of the halo and should therefore not be weighted by the profile, we derive:

$$\begin{aligned}
P^{1h}_{\rm gal}(k) &= 2 \int n(M) \frac{\langle N_{\rm cen} N_{\rm sat}|M\rangle}{\bar n_{\rm gal}^2} \left[ u_{\rm gal}(k|M) - W(kR) \right] dM \\
&\quad + \int n(M) \frac{\langle N_{\rm sat}(N_{\rm sat} - 1)|M\rangle}{\bar n_{\rm gal}^2} \left[ |u_{\rm gal}(k|M)|^2 - W(kR)^2 \right] dM \\
P^{2h}_{\rm gal}(k) &= P_{\rm lin}(k) \left[ \int n(M) b(M) \frac{\langle N_{\rm cen}|M\rangle}{\bar n_{\rm gal}} \, dM + \int n(M) b(M) \frac{\langle N_{\rm sat}|M\rangle}{\bar n_{\rm gal}} u_{\rm gal}(k|M) \, dM \right]^2.
\end{aligned} \tag{6.5}$$

Note that we have followed Valageas & Nishimichi (2011) in adding a counter-term to the halo profiles in the 1-halo term, which ensures the 1-halo term goes to zero for $k \to 0$. Here $W(kR)$ is the Fourier transform of a spherical top-hat of radius $R(M) = [3M/(4\pi\bar\rho)]^{1/3}$, given by:

$$W(kR) = 3 \left[ \frac{\sin(kR)}{(kR)^3} - \frac{\cos(kR)}{(kR)^2} \right]. \tag{6.6}$$

In our model, we take $P_{\rm lin}(k)$ to be the realised linear input power spectrum from the dark matter initial conditions. We calculate the halo mass function, $n(M)$, directly from the dark matter simulation as well and spline-fit the results.
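The counter-term in equation (6.5) needs $W(kR)$ all the way down to $k \to 0$, where the closed form of equation (6.6) loses precision because the sine and cosine terms nearly cancel. The following is a minimal Python sketch of one way to evaluate it, assuming a short Taylor series is acceptable for small $kR$. The function names and the switch-over point are illustrative choices and are not taken from the code used in this chapter.

```python
import math


def tophat_window(x: float) -> float:
    """Fourier transform of a spherical top-hat, W(x) with x = k * R (eq. 6.6)."""
    if abs(x) < 1e-3:
        # Series expansion W(x) = 1 - x^2/10 + x^4/280 - ...,
        # used to avoid cancellation between sin(x) and x*cos(x) near x = 0.
        # The threshold 1e-3 is an assumption of this sketch.
        return 1.0 - x * x / 10.0 + x ** 4 / 280.0
    return 3.0 * (math.sin(x) - x * math.cos(x)) / x ** 3


def tophat_radius(mass: float, rho_mean: float) -> float:
    """Radius R(M) = [3 M / (4 pi rho_mean)]^(1/3) of the top-hat enclosing M."""
    return (3.0 * mass / (4.0 * math.pi * rho_mean)) ** (1.0 / 3.0)
```

Since $W \to 1$ as $k \to 0$, and the normalised profile $u_{\rm gal}$ does as well, both bracketed differences in the 1-halo term of equation (6.5) vanish in that limit, as intended.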
Furthermore, we use the fit for the $M_{200\rm mean}$ halo bias function provided by Tinker et al. (2010) for $b(M)$, and compute each of the four HOD terms directly from the SAM run on our halo subsample, spline-fitting these results as well.

6.2.1.2 The galaxy distribution

The normalised Fourier transform of the galaxy distribution, $u_{\rm gal}(k|M)$, is often derived from the dark matter mass profile of the halo. This in turn is usually assumed to follow the Navarro, Frenk & White (1997, NFW) profile, cut off at the virial radius $r_{\rm vir} = R_{200\rm mean}$, with some concentration-mass relation $c(M)$:

$$\rho_{\rm NFW}(r) = \frac{\rho_0}{(r/r_s)(1 + r/r_s)^2}, \tag{6.7}$$

where $r_s = r_{\rm vir}/c$ is the scale radius. The main advantage of using the one-parameter NFW profile is that this leads to an analytic expression for $u(k|M)$. However, many authors have shown that the Einasto (1965) profile provides a more accurate fit to the mean profile of haloes of a given mass, and to the distribution of dark matter substructure (e.g. Navarro et al., 2004; Merritt et al., 2005, 2006; Gao et al., 2008; Springel et al., 2008; Stadel et al., 2009; Navarro et al., 2010; Reed, Koushiappas & Gao, 2011; Dutton & Macciò, 2014). The two-parameter Einasto density profile is given by:

$$\rho_{\rm Ein}(r) = \rho_0 \exp\left\{ -\frac{2}{\alpha} \left[ \left(\frac{r}{r_s}\right)^{\alpha} - 1 \right] \right\}, \tag{6.8}$$

where the shape parameter $\alpha$ allows additional freedom in the slope of the profile. This function does not have an analytic Fourier transform, and an extra numerical integration step is therefore needed when replacing the NFW profile by an Einasto one. The larger degeneracies in fitting a two-parameter model also mean more data points are needed to obtain a reliable fit. Still, when the computational expense is acceptable and enough information on the measured profile is available, the increased accuracy may be worth the cost.
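To make the comparison between the two profiles concrete, the sketch below evaluates equations (6.7) and (6.8) and estimates their logarithmic slopes numerically. It illustrates that both profiles have slope $-2$ at $r = r_s$, while only the Einasto profile lets $\alpha$ control how quickly the slope changes away from that radius. The function names and the finite-difference step are assumptions made for this illustration.

```python
import math


def rho_nfw(r: float, rho0: float, rs: float) -> float:
    """NFW density profile, eq. (6.7)."""
    x = r / rs
    return rho0 / (x * (1.0 + x) ** 2)


def rho_einasto(r: float, rho0: float, rs: float, alpha: float) -> float:
    """Einasto density profile, eq. (6.8); rho0 is the density at r = rs."""
    return rho0 * math.exp(-(2.0 / alpha) * ((r / rs) ** alpha - 1.0))


def log_slope(profile, r: float, eps: float = 1e-4) -> float:
    """d ln(rho) / d ln(r), estimated with a central difference in ln(r)."""
    hi = profile(r * math.exp(eps))
    lo = profile(r * math.exp(-eps))
    return (math.log(hi) - math.log(lo)) / (2.0 * eps)
```

For example, `log_slope(lambda r: rho_einasto(r, 1.0, 1.0, 0.18), 1.0)` and `log_slope(lambda r: rho_nfw(r, 1.0, 1.0), 1.0)` both evaluate to approximately $-2$; the value $\alpha = 0.18$ is only a placeholder.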
We find that the Einasto profile provides an excellent fit to the distribution of satellite galaxies in the inner parts of haloes in our simulation. But even the Einasto profile over-predicts the number of galaxies at large radii, $r \gtrsim 0.7\,r_{\rm vir}$. Additionally, standard practice is to cut off the profile at the virial radius, while we find that $\sim 10\%$ of the satellite galaxies in our simulation are found at distances $1 < r/r_{\rm vir} < 3$. Note that these galaxies are not necessarily outside the virialised region, as haloes are typically not spherical objects. We therefore seek a profile with the same small-scale behaviour as the Einasto profile, while simultaneously fitting the galaxy distribution out to $\sim 3\,r_{\rm vir}$. We find that the following functional form, which we refer to here as the “gamma” profile, is capable of providing an excellent match to the galaxy distribution over the full range of scales we consider, and at any halo mass:

$$n_g(r) = n_0 \left(\frac{r}{b}\right)^{ac - 3} \exp\left[ -\left(\frac{r}{b}\right)^{c} \right]. \tag{6.9}$$

Figure 6.1: Galaxy number density profiles for all Guo et al. (2011) galaxies with stellar masses $10.27 < \log_{10}(M_*/M_\odot) < 10.77$, for five different halo mass bins (shown in different colours). The legend shows the mean logarithmic mass in each of the bins. Solid lines indicate the measured profiles, while dashed lines show the best-fit gamma profiles (see equation 6.12). The halo mass bins are dynamically chosen such that each contains roughly the same number of galaxies, and the fits are performed using 30 radial bins spaced equally in log-space between $\log_{10} x = -2.5$ and $\log_{10} x = 0.5$.

This fitting function has three parameters, $a$, $b$ and $c$. Note that the role of $b$ is similar to that of $r_s$ in the Einasto profile. Both the Einasto and gamma profiles are near universal if defined in terms of $x \equiv r/r_{\rm vir}$.
If we rewrite both profiles in terms of $x$ (with $r_s$ and $b$ expressed in units of $r_{\rm vir}$) and integrate them to obtain $N(<r)$, the similarities and differences between the profiles are most easily appreciated. For the Einasto profile:

$$N_{\rm Ein}(<r) = N_{\rm tot} \frac{\gamma\left[ \frac{3}{\alpha}, \frac{2}{\alpha} \left(\frac{x}{r_s}\right)^{\alpha} \right]}{\gamma\left[ \frac{3}{\alpha}, \frac{2}{\alpha} \left(\frac{x_{\rm max}}{r_s}\right)^{\alpha} \right]}, \tag{6.10}$$

while for the gamma profile:

$$N_g(<r) = N_{\rm tot} \frac{\gamma\left[ a, \left(\frac{x}{b}\right)^{c} \right]}{\gamma\left[ a, \left(\frac{x_{\rm max}}{b}\right)^{c} \right]}. \tag{6.11}$$

Here $\gamma(\cdot,\cdot)$ is the lower incomplete gamma function, and we have assumed the profiles cut off at some $x_{\rm max}$. The similarities in the two profiles are clear, and the main difference is that the two arguments of the gamma function are independent for the gamma profile, which effectively allows for a steeper profile at large $x$ and consequently a better match to the galaxy distribution around the virial radius. In practice, we fit a normalised number density profile $n_g(r)/\langle N_g\rangle$ to the galaxy distribution before numerically Fourier transforming this to obtain $u_{\rm gal}(k|M)$. For completeness, $n_g(r)/\langle N_g\rangle$ is given by:

$$\frac{n_g(r)}{\langle N_g\rangle} = \frac{c}{4\pi b^3 r_{\rm vir}^3 \, \gamma\left[ a, \left(\frac{x_{\rm max}}{b}\right)^{c} \right]} \left(\frac{x}{b}\right)^{ac - 3} \exp\left[ -\left(\frac{x}{b}\right)^{c} \right]. \tag{6.12}$$

In our model we set $x_{\rm max} = 3$, as $> 99.9\%$ of satellites in our fiducial model are found inside this radius. Even for small halo samples, the three parameters of the fit are independent enough to ensure degeneracies are not a problem. An example is given in Figure 6.1, where we show the best-fit model for all galaxies with stellar masses $10.27 < \log_{10}(M_*/M_\odot) < 10.77$ in the Guo et al. (2011) semi-analytical model, for five different halo mass bins. The solid lines show the measured number density profiles, while the dashed lines show the best-fit gamma profiles. The halo mass bins are dynamically chosen inside the code such that each contains roughly the same number of galaxies.
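Equations (6.11) and (6.12) can be checked against each other numerically. The sketch below, in pure Python, evaluates the lower incomplete gamma function with its standard power series and then computes the enclosed fraction and the normalised density of the gamma profile, working in units where $r_{\rm vir} = 1$. The function names, the series-based evaluation and the parameter values in the usage note are assumptions for illustration; they are not the fitted values or the implementation used in this chapter.

```python
import math


def lower_incomplete_gamma(a: float, z: float, tol: float = 1e-14,
                           max_terms: int = 10_000) -> float:
    """gamma(a, z) = z^a e^-z * sum_n z^n / [a (a+1) ... (a+n)]."""
    if z <= 0.0:
        return 0.0
    term = 1.0 / a
    total = term
    for n in range(1, max_terms):
        term *= z / (a + n)
        total += term
        if abs(term) < tol * abs(total):
            break
    return z ** a * math.exp(-z) * total


def gamma_profile_fraction(x: float, a: float, b: float, c: float,
                           x_max: float = 3.0) -> float:
    """Enclosed fraction N(<x) / N_tot for the gamma profile, eq. (6.11)."""
    x = min(x, x_max)  # the profile is cut off at x_max
    num = lower_incomplete_gamma(a, (x / b) ** c)
    den = lower_incomplete_gamma(a, (x_max / b) ** c)
    return num / den


def gamma_profile_density(x: float, a: float, b: float, c: float,
                          x_max: float = 3.0) -> float:
    """Normalised number density n_g / <N_g> in units of r_vir^-3, eq. (6.12)."""
    norm = c / (4.0 * math.pi * b ** 3
                * lower_incomplete_gamma(a, (x_max / b) ** c))
    return norm * (x / b) ** (a * c - 3.0) * math.exp(-((x / b) ** c))
```

With placeholder parameters such as `a=1.5, b=0.4, c=1.2`, integrating `4 * math.pi * x**2 * gamma_profile_density(x, ...)` from 0 to 1 reproduces `gamma_profile_fraction(1.0, ...)`, which is a convenient consistency check on the normalisation.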
We use 30 radial bins spaced equally in log-space between $\log_{10} x = -2.5$ and $\log_{10} x = 0.5$, and fit an Akima spline through each of the three parameters as a function of halo mass to obtain smooth functions that are stable to outliers.

6.2.1.3 Correction for non-sphericity

As is common, we have assumed a spherical distribution of satellite galaxies around each central. In reality, haloes and consequently their galaxy populations are triaxial. van Daalen, Angulo & White (2012) investigated the effect of assuming a spherical distribution on the two-point correlation function and galaxy power spectrum, and found that the effects can be quite large, with the true power being underestimated by 1% around $k = 0.2\,h\,{\rm Mpc}^{-1}$ to 10% around $k = 25\,h\,{\rm Mpc}^{-1}$, increasing even more towards smaller scales (see the right panel of their Figure 3, or Figure 4.3 in Chapter 4 of this thesis). We have repeated their analysis and found that the functional shape of this underestimation of the power appears to be completely independent of the mass of the galaxies. We therefore fit a function $e(k)$ through these results and use this to correct our halo model power spectra for the combined effects of non-sphericity. The final galaxy power spectrum that comes out of our model for a given set of galaxies is therefore:

$$P_{\rm gal}(k) = \frac{P^{1h}_{\rm gal}(k) + P^{2h}_{\rm gal}(k)}{1 + e(k)}, \tag{6.13}$$

with $P^{1h}_{\rm gal}(k)$ and $P^{2h}_{\rm gal}(k)$ given by equation (6.5).

6.2.1.4 Converting to the projected correlation function

To obtain the projected correlation function from the galaxy power spectrum, we numerically perform two standard transformations. First, to obtain the 3D correlation function:

$$\xi(r) = \frac{1}{2\pi^2} \int_0^\infty k^2 P(k) \frac{\sin kr}{kr} \, dk, \tag{6.14}$$

and, finally, to obtain the projected galaxy correlation function:

$$w(r_p) = 2 \int_0^\infty \xi\left( \sqrt{r_p^2 + \pi^2} \right) d\pi = 2 \int_{r_p}^\infty \frac{r \, \xi(r)}{\sqrt{r^2 - r_p^2}} \, dr. \tag{6.15}$$

Here $r_p$ and $\pi$ are the projected and line-of-sight separation, respectively. It is in this last step that we also convert the units from Mpc/$h$ to Mpc, in order to directly compare our model $w(r_p)$ to that of observations.

Figure 6.2: The FoF halo mass function, showing the number of haloes available in the Millennium Simulation at $z = 0$ (black) and the number randomly selected as a function of $M_{200\rm mean}$ in each subsample (red). The subsamples each comprise less than 0.04% of the total halo sample, or 0.14% of the total subhalo sample. The selection function was built iteratively by demanding that $\sim 90\%$ of the random samples it generated lead to projected correlation functions that were within 30% of the full sample prediction. Low-mass haloes were favoured over high-mass haloes in order to suppress the size of the trees used in the SAM. Even so, the fraction of FoF groups needed to match the correlation function within some uncertainty at any stellar mass is higher for more massive haloes.

6.2.1.5 Selection function

The selection function we use to create the halo sample on which the SAM is repeatedly run while exploring parameter space was built using the following algorithm. At each step, the algorithm adds some number of Friends-of-Friends (FoF) groups to each halo mass bin in turn, and generates a number of random samples for each of the resulting selection functions. The correlation functions predicted using these samples are then compared to determine which mass bin would contribute the largest reduction in the variance with respect to the full model run for all six stellar mass bins. If at any step adding more haloes does not reduce the variance for any halo mass, FoF groups are added to a random bin. This continues until at least 90% of the random samples the current selection function generates lead to projected correlation functions that are within 30% of the full sample prediction. In order to suppress the size of the merger trees used in the SAM, low-mass haloes were favoured over high-mass haloes by weighting the number of FoF groups added to each mass bin by the inverse of the average number of subhaloes hosted by FoF groups of that mass. Nonetheless, the fraction of haloes selected at high mass is still higher than at low mass, since more massive haloes potentially contribute more galaxies to the sample, increasing the accuracy of the estimates made in the clustering model described above. Additionally, the most massive galaxies probed here, $M_* > 10^{11.27}\,M_\odot$, preferentially occupy the most massive haloes. After building several selection functions in this way, we found that on average they were well approximated by the combination of a constant value and a power law (rounded to integer values). This is the near-optimal selection function shown in Figure 6.2 (red line), which takes the constant value $N_h = 200$ below $M_{200\rm mean} = 10^{12.2}\,h^{-1}\,M_\odot$. The subsamples generated by this selection function each comprise less than 0.04% of the total FoF halo sample, or 0.14% of the total subhalo sample.

Figure 6.3: The fractional difference between our model prediction of the projected galaxy correlation function and a direct calculation, for galaxies in the Guo et al. (2011) semi-analytical model. Here we use the full galaxy sample as an input to our model. Results are shown for six different stellar mass bins, indicated by lines of different colours, over the range where SDSS/DR7 data is available for each. The overall agreement is within 20%, with the model tending to overpredict the clustering on sub-Mpc scales. This can be traced to an overestimation of the power in the 1-halo term by a similar amount around $k = 1\,h\,{\rm Mpc}^{-1}$. For our application, our model performs well enough, and we leave improvements to future work.

Figure 6.4: The fractional difference between the predictions of our model for 100 halo subsamples and the model prediction for the full sample.
Each subsample consists of about 0.14% of the total subhalo sample (see text for details). The colours indicate the same stellar mass bins as in Figure 6.3. Left: The predictions of each of the 100 separate realisations, showing the scatter around the full sample result. Right: Same as the left panel, but now showing only the median, 16th and 84th percentiles. Even using a small random sample, our model can quickly estimate the projected correlation function to $\sim 10\%$ precision.

6.2.1.6 Performance of the model

We compare our model prediction of $w(r_p)$, using the full halo sample, to that calculated directly for the galaxies in the Guo et al. (2011) model in Figure 6.3. Here we show the relative difference between the two for six different bins in stellar mass, indicated as ranges in $\log_{10}(M_*/M_\odot)$. We only show the results over the range where we constrain $w(r_p)$ using observations. The model performs well, and any deviations from the true correlation function are typically within 20%. The magnitude of the mismatch tends to increase with stellar mass. The large-scale disagreement is caused by the model slightly under-predicting the power in the transition region between the 1-halo and 2-halo terms, while the small-scale offset is mostly due to the 1-halo term in the power spectrum being slightly overestimated around $k = 1\,h\,{\rm Mpc}^{-1}$. However, overall the agreement is good, especially considering our relatively simple treatment of e.g. the halo bias (linear and scale-independent), and we leave further improvements, such as using a halo-halo power spectrum measured from the dark matter only simulation instead of a biased linear power spectrum, to future work. The true power of the model lies in its ability to reproduce the clustering prediction for the full sample from only a small subsample of FoF groups.
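All of the comparisons in this section are made in terms of $w(r_p)$, so the projection step of equation (6.15) is evaluated for every model prediction. A minimal sketch of one way to carry out that integral is given below. It assumes the substitution $r = r_p \cosh t$, which removes the integrable singularity at $r = r_p$ in the second form of equation (6.15), followed by Simpson's rule on a truncated $t$ range. The function name, the truncation `t_max` and the step count are illustrative assumptions, and the sketch does not include the unit conversion or the transform of equation (6.14).

```python
import math


def projected_correlation(xi, r_p: float, t_max: float = 40.0,
                          n_steps: int = 4000) -> float:
    """w(r_p) = 2 r_p * int_0^inf xi(r_p cosh t) cosh t dt (eq. 6.15).

    `xi` is any callable returning the 3D correlation function at radius r.
    It must fall off faster than 1/r for the integral to converge.
    """
    if n_steps % 2:
        n_steps += 1  # Simpson's rule needs an even number of intervals
    h = t_max / n_steps

    def integrand(t: float) -> float:
        cosh_t = math.cosh(t)
        return xi(r_p * cosh_t) * cosh_t

    total = integrand(0.0) + integrand(t_max)
    for i in range(1, n_steps):
        total += (4.0 if i % 2 else 2.0) * integrand(i * h)
    return 2.0 * r_p * total * h / 3.0
```

A useful check is a pure power law, $\xi(r) = (r/r_0)^{-\gamma}$, for which the projection has the closed form $w(r_p) = r_p (r_0/r_p)^{\gamma}\, \Gamma(1/2)\, \Gamma[(\gamma - 1)/2] / \Gamma(\gamma/2)$; the numerical result can be compared against this expression using `math.gamma`.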
In Figure 6.4 we compare the predictions for 100 random subsamples selected according to the selection function shown in Figure 6.2 to the model prediction for the full sample. The dotted lines indicate offsets of 30% for reference, and the colours indicate the same stellar mass bins as in Figure 6.3. The scatter is around 7–8% for the first four mass bins, increasing to 10% and 16% for the fifth and sixth mass bin respectively. This shows that the model is capable of reproducing the full sample estimate from relatively few haloes.

6.2.2 The SAM and MCMC

Because our estimator is able to quickly and accurately recover the projected correlation function from a very small subsample of haloes, it is ideally suited for constraining the parameter space of semi-analytical models using the projected correlation function. In this work we present a first application, where we constrain the model of G13, a recent version of the Munich semi-analytical code, using both the galaxy stellar mass function (SMF) and the projected galaxy correlation function. For this we utilise the same data sets as presented in G13. As we will only utilise the Millennium Simulation, and not Millennium II, we only use constraints at $M_* > 10^9\,h^{-1}\,M_\odot$. The G13 model includes 17 parameters which together determine the outcome of galaxy formation.
These are (see Table 6.1): the star formation efficiency ($\alpha_{\rm SF}$); the star formation criterion ($\tilde M_{\rm crit}$, or equivalently $\Sigma_{\rm crit}$); the star formation efficiency in the burst phase following a merger ($\alpha_{\rm SF,burst}$); the slope on the merger mass ratio determining the stellar mass formed in the burst ($\beta_{\rm SF,burst}$); the AGN radio mode efficiency ($k_{\rm AGN}$); the black hole growth efficiency ($f_{\rm BH}$); the typical halo virial velocity of the black hole growth process ($V_{\rm BH}$); three parameters governing the reheating and injection of cold disk gas into the hot halo phase by supernovae, namely the gas reheating efficiency ($\epsilon$), the reheating cut-off velocity ($V_{\rm reheat}$) and the slope of the reheating dependence on $V_{\rm vir}$ ($\beta_1$); three parameters governing the ejection of hot halo gas to an external reservoir, namely the gas ejection efficiency ($\eta$), the ejection cut-off velocity ($V_{\rm eject}$) and the slope of the ejection dependence on $V_{\rm vir}$ ($\beta_2$); a parameter controlling the gas return time from the external reservoir to the hot halo ($\gamma$); the yield fraction of metals returned to the gas phase by stars ($y$); the mass ratio separating major and minor merger events ($R_{\rm merger}$); and finally a parameter controlling the dynamical friction time scale of orphan galaxies, i.e. the time it takes for satellite galaxies of which the dark matter subhalo is disrupted (or at least no longer detected) to merge with the central galaxy ($\alpha_{\rm friction}$). While in the original G13 paper some of these parameters were held fixed, here we allow all 17 to vary. We start our Monte Carlo Markov Chains (MCMCs) at the position in parameter space used by Guo et al. (2011), which was arrived at by using a combination of SMFs, as well as rest-frame B-band and K-band luminosity functions between $z = 0$ and $z = 3$, as observational constraints. We then use the same techniques as described in G13 to find a new set of best-fit parameters, with the projected correlation function as an additional constraint.

Parameter                   Description                              Units
$\alpha_{\rm SF}$           Star formation efficiency                –
$\tilde M_{\rm crit}$       Star formation threshold                 $M_\odot$ km s$^{-1}$ Mpc$^{-1}$
$\alpha_{\rm SF,burst}$     Star formation burst mode efficiency     –
$\beta_{\rm SF,burst}$      Star formation burst mode slope          –
$k_{\rm AGN}$               Radio feedback efficiency                $h^{-1}\,M_\odot$ yr$^{-1}$
$f_{\rm BH}$                Black hole growth efficiency             –
$V_{\rm BH}$                Quasar growth scale                      km s$^{-1}$
$\epsilon$                  SN mass-loading efficiency               –
$V_{\rm reheat}$            Mass-loading scale                       km s$^{-1}$
$\beta_1$                   Mass-loading slope                       –
$\eta$                      SN ejection efficiency                   –
$V_{\rm eject}$             SN ejection scale                        km s$^{-1}$
$\beta_2$                   SN ejection slope                        –
$\gamma$                    Ejecta reincorporation scale factor      –
$y$                         Metal yield fraction                     –
$R_{\rm merger}$            Major-merger threshold ratio             –
$\alpha_{\rm friction}$     Dynamical friction scale factor          –

Table 6.1: Parameters varied in the MCMC. The best-fit values (as well as the G13 values for the WMAP1 cosmology and the prior ranges) are shown in Figure 6.8. For more information we refer to G13.

Figure 6.5: The projected galaxy correlation function in six bins of stellar mass. The points with error bars show the SDSS data in each bin, while the lines show the model results. The green dotted line shows the results for the original model from G13, in which the parameter values were set manually.
The blue lines show the results of only using the stellar mass function as a constraint, while the red lines show the results when the model is simultaneously constrained by the projected correlation function and the stellar mass function. Finally, dashed and dotted lines are used to indicate whether these are the results for the sample haloes or for all haloes, respectively. The clustering on small scales of the full model is systematically underestimated by the sample, which is mostly due to the clustering estimator (see §6.2.1.6). Note that even though the lowest mass bin is not used as a constraint, the match to observations is markedly improved with respect to the other models.

As our model is only accurate to within $\sim 10\%$ on small scales, and additionally since the error bars on the SDSS clustering data were derived from Poisson statistics alone, and so do not include cosmic variance, we artificially increase the error bars on the data points used during the fitting. Each data point of the observed projected correlation function was assumed to have an uncertainty of 20%. As noted before, we do not use the clustering data below $M_* = 10^{9.27}\,M_\odot$, nor the stellar mass function data below $M_* = 10^9\,M_\odot$, when constraining the model, as the haloes hosting these galaxies are not well resolved in the original Millennium Simulation which we are using as a basis for the SAM. When fitting to the SMF and clustering data simultaneously, we increase the relative weighting of the fit to the SMF by a factor of five to compensate for the fact that the clustering data is measured in five separate bins. This helps avoid sacrificing the excellent fit to the SMF in favour of matching the correlation function. Note that while G13 used a WMAP7 cosmology, here we use the original WMAP1 cosmology to avoid additional complications introduced by scaling to a different cosmology. In future work the results will be explored for more up-to-date cosmologies.
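As an illustration of the error model and weighting described above, the sketch below combines an SMF term and per-bin clustering terms into a single log-likelihood. The Gaussian form, the choice to apply the factor of five as a multiplier on the SMF chi-square, and all function and argument names are assumptions made for this example; the text does not specify how the weighting is implemented in the actual MCMC code.

```python
def chi2_absolute(model, data, sigma):
    """Chi-square with an absolute 1-sigma uncertainty per data point."""
    return sum(((m - d) / s) ** 2 for m, d, s in zip(model, data, sigma))


def chi2_fractional(model, data, frac_err=0.20):
    """Chi-square with a fixed fractional uncertainty (20% by default)."""
    return sum(((m - d) / (frac_err * d)) ** 2 for m, d in zip(model, data))


def combined_log_likelihood(smf_model, smf_data, smf_sigma,
                            wp_model_bins, wp_data_bins,
                            smf_weight=5.0, wp_frac_err=0.20):
    """Assumed Gaussian log-likelihood (up to a constant).

    smf_*          : stellar mass function values and their uncertainties
    wp_*_bins      : one sequence of w(r_p) values per stellar mass bin
    smf_weight     : relative weight of the SMF term (five in the text)
    wp_frac_err    : fractional uncertainty assumed on each w(r_p) point
    """
    chi2_smf = chi2_absolute(smf_model, smf_data, smf_sigma)
    chi2_wp = sum(chi2_fractional(m, d, wp_frac_err)
                  for m, d in zip(wp_model_bins, wp_data_bins))
    return -0.5 * (smf_weight * chi2_smf + chi2_wp)
```

With this form, a single SMF point that is off by $1\sigma$ costs as much as five clustering points that are each off by 20%, which is one way to read the compensation for the five clustering bins.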
Contrary to what is claimed in G13, the change in cosmology has a negligible impact on the resulting correlation functions, which are far more sensitive to the SAM’s physical recipes. Besides updating the cosmology, the only change made from the WMAP1 Guo et al. (2011) model to the newer WMAP7 G13 model is that the type 2 (orphan) satellite galaxy positions are now correctly updated in the code, meaning that their orbits now decay as intended and can therefore be disrupted earlier. This change was the main reason for the improved agreement with clustering data with respect to Guo et al. (2011).

6.3 Results

6.3.1 Comparison with observations

The results of our MCMC chains for the projected correlation function are shown in Figure 6.5, for six bins in stellar mass, as indicated in the panels. In each figure, we indicate the original results found by G13, where the galaxy formation parameters were set by hand, as a green dotted line. The new results are shown in blue and red; in blue, we show the correlation functions that follow from only using the stellar mass function as a constraint (“SMF-only”), while in red we show the results of fitting to the clustering data simultaneously (“SMF+$w(r_p)$”). The dashed lines show the predictions made based on the sample of haloes used in the MCMC, as described in §6.2.1. The dotted lines show the true galaxy correlation function, as calculated directly from the full galaxy catalogue for the same model parameters. The true values are generally below the ones estimated from the sample, as expected from the results of §6.2.1.6, and as a consequence the new results tend to under-predict the amount of clustering on the smallest scales. Even so, one immediately sees that the SMF+$w(r_p)$ correlation functions (red lines) generally provide a better fit to the data, bringing the small-scale clustering down considerably in comparison with the original G13 and SMF-only (blue lines) models, which are very close together.
This effect is larger for low stellar masses, where the clustering discrepancy between the old model and the data was larger as well. The much improved match to observations indicates that the model retains enough freedom to match the clustering data. Note that the match to the projected galaxy correlation function for galaxies in the first mass bin is greatly improved as well, even though this data is not used to constrain the model. For the highest-mass galaxies, $11.27 < \log_{10}(M_*/M_\odot) < 11.77$, all models perform equally well, while for galaxies with masses above $10^{10.27}\,M_\odot$ the SMF-only correlation functions perform better on the smallest scales, due to the clustering estimator overestimating the small-scale clustering.

Figure 6.6: The stellar mass functions of the models. The green line again refers to the original G13 model, of which the parameters were set manually. The blue lines show the results when the MCMC algorithm is used with only the stellar mass function as a constraint, while the red lines again show the result when clustering constraints are used additionally. When the SMF is the only constraint, the model clearly has enough freedom to reproduce it to high precision. However, the match grows somewhat worse at low mass when the model is additionally constrained by clustering, and is in some places about $2\sigma$ away from the combined observational constraints shown in black. Still, the SMF+$w(r_p)$ model performs better in matching both sets of constraints simultaneously.

However, the improved match to the observed clustering data (at least for low-mass galaxies) comes at a price. In Figure 6.6, we show how the models compare to the SMF data used to constrain the models. The black points with error bars are derived by combining several observational data sets (see G13). The original G13 model, in which the parameters were set by hand, is again shown as a green dotted line, which matches the data well.
When we use only the SMF as a constraint for the galaxy formation model, shown in blue, we obtain a marginally better fit to the data at low mass. When the projected galaxy correlation function is used as an additional constraint, shown in red, the agreement with the stellar mass function suffers considerably in favour of the clustering predictions. While the agreement for galaxies with masses M∗ " 1010.5 M( is still comparable to that obtained by G13, the new model over-predicts the number densities of lower-mass galaxies, although the results are still within 2σ of the data. Note that the sample results (dashed lines) agree perfectly with the full catalogue ones (dotted lines) for both SMF-only 132 6.3.1 Comparison with observations Figure 6.7: Comparison of the galaxy distribution profiles for the SMF-only (solid lines) and SMF+w(rp ) (dashed lines) best-fit parameters. The different panels show the profiles of galaxies in the six correlation function mass bins, as indicated in the top right of each panel. As in Figure 6.1, different colours are used for different halo mass bins, which are set to be the same for both models to allow for an unbiased comparison. Note that the mass bins do change as a function of stellar mass in order to make sure each bin in halo mass is roughly equally populated. For clarity, we show only the fits to the measured profiles (see equation 6.12) here, but stress that each provides an excellent fit over the full range shown here. Note that the dynamic range in scales has been extended relative to Figure 6.1 to better appreciate the differences between the profiles. Mainly because of the reduced dynamical friction time scale in the latter model, the profiles of galaxies in every mass bin are slightly flatter at any halo mass, reducing the correlation function on small scales. 
and SMF+w(rp), which indicates that the discrepancy observed for the correlation function is indeed due to the inaccuracy of the estimator at small separations.

The slight mismatch for low-mass galaxies could indicate that the SAM is missing some physical ingredient needed in order to reproduce observations, but other viable explanations also exist. For both the clustering and SMF data the uncertainties may be underestimated; for example, the error bars on the correlation function do not take into account cosmic variance, which could have a quite significant effect. If the observed correlation functions are biased low because of this, the clustering in our model may have been brought artificially low, preventing us from matching the SMF simultaneously. Another possible source of errors could be systematic uncertainties in the observations that lead to samples that are not volume limited. Additionally, changing the cosmology to one that is more up-to-date may help. We will explore some of these possibilities in future work.

Note, however, that the SMF+w(rp) model is in far closer agreement with both the SMF and the clustering data simultaneously than either the original G13 or the SMF-only model: while the latter models are in strong disagreement with the clustering data for low-mass galaxies on small scales, the SMF+w(rp) model is generally in agreement with both the low-mass clustering data and the SMF within 2σ. This shows the merit of using a clustering estimator while exploring parameter space.
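The trade-off described above, in which adding the clustering data improves the match to w(rp) at some cost to the SMF, can be illustrated with a minimal sketch of a combined goodness-of-fit. This is not the likelihood used in this chapter: it assumes independent Gaussian errors and equal weighting of the two data sets, the numbers are made up, and the function names `chi2` and `log_likelihood` are hypothetical.

```python
import numpy as np

def chi2(model, data, sigma):
    """Sum of squared, error-normalised residuals (assumes independent Gaussian errors)."""
    model, data, sigma = map(np.asarray, (model, data, sigma))
    return float(np.sum(((model - data) / sigma) ** 2))

def log_likelihood(smf_model, smf_data, smf_err,
                   wp_model=None, wp_data=None, wp_err=None):
    """Log-likelihood up to an additive constant.

    The clustering term is optional, so the same function covers the
    SMF-only and the SMF+w(rp) cases (equal weighting is an assumption).
    """
    total = chi2(smf_model, smf_data, smf_err)
    if wp_model is not None:
        total += chi2(wp_model, wp_data, wp_err)
    return -0.5 * total

# Illustrative numbers only (not taken from this chapter).
smf_data, smf_err = np.array([1.0, 0.5, 0.1]), np.array([0.1, 0.05, 0.02])
wp_data, wp_err = np.array([300.0, 100.0]), np.array([30.0, 10.0])

# Model A fits the SMF exactly but overshoots the clustering by 3 sigma per bin.
model_a = dict(smf=smf_data.copy(), wp=wp_data + 3 * wp_err)
# Model B sits 1 sigma off in every SMF bin but matches the clustering.
model_b = dict(smf=smf_data + smf_err, wp=wp_data.copy())

for name, m in (("A", model_a), ("B", model_b)):
    smf_only = log_likelihood(m["smf"], smf_data, smf_err)
    combined = log_likelihood(m["smf"], smf_data, smf_err,
                              m["wp"], wp_data, wp_err)
    print(f"model {name}: SMF-only lnL = {smf_only:.2f}, combined lnL = {combined:.2f}")
```

With the SMF as the only constraint the toy model A is preferred, while the combined constraint prefers model B, which mirrors how the preferred parameters can shift once clustering data are included.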
6.3.2 Change in parameters

Even though we vary 17 galaxy formation parameters, by far the largest role in bringing the clustering predictions into agreement with observations is played by only two of these: αfriction, which controls the time it takes for satellite galaxies to merge with the central once their dark matter subhalo has been disrupted, and γ, which controls the time it takes for ejected gas to re-enter the halo. The way these parameters influence the clustering and stellar mass function predictions is as follows.

When the clustering data is included as an additional constraint, the dynamical friction time scale of orphan galaxies decreases by more than a factor of three with respect to the SMF-only results. This causes galaxies at small separation scales to merge with their centrals much quicker, flattening the galaxy distribution profile within the haloes and greatly decreasing the amount of clustering on small scales, especially for low-mass satellites. This change in the galaxy distribution profiles from the SMF-only to the SMF+w(rp) model is shown in Figure 6.7. The halo mass bins are set to be the same for both models to allow for an unbiased comparison. Note that the mass bins do change as a function of stellar mass in order to make sure each bin in halo mass is roughly equally populated. Although we only show the fits to the measured profiles here, we stress that each provides an excellent fit to the data, over the full range in scales shown here. The change in slope of the profiles is relatively small, meaning that the galaxy distributions are still consistent with SDSS data for rich clusters (see Figure 14 of Guo et al., 2011). This is because even though the friction time scale

Figure 6.8: The preferred parameter values in both models. The best-fit values are shown as dashed vertical lines.
The dotted vertical line again shows the result for the original G13 model, and the grey regions indicate values deemed non-physical, and which are therefore made inaccessible to the model. For most parameters the best-fit values are consistent between the model where clustering is not used as a constraint and the model where it is. The largest shifts occur for αSF, M̃crit, y, γ and αfriction. The last two are most important for bringing the clustering predictions into agreement with data, while the rest mainly serve to preserve the match to the SMF.

decreases by more than a factor of three when using clustering as an additional constraint, the number of type 2 galaxies at z = 0 decreases only by a factor of 0.87, as the merging time scale for many of these galaxies is still long compared to the Hubble time.

Additionally, however, the decrease in the dynamical friction time scale causes the number density of galaxies above the knee (M∗ > 10^10.5 M⊙) to decrease as well. This counter-intuitive change in the SMF comes about because the cold gas in the merging satellites directly feeds the supermassive black holes in the centres of the central galaxies, increasing feedback from AGN and thereby the suppression of star formation.

The γ parameter, on the other hand, increases by more than a factor of five in SMF+w(rp) with respect to SMF-only, meaning that the hot gas reincorporation time scale decreases by the same factor. This raises the number densities of galaxies at any mass, but most significantly below the knee of the SMF (M∗ < 10^10.5 M⊙). The change in γ is the main source of the higher low-mass number densities going from SMF-only to SMF+w(rp). The upside is that this parameter shift also lowers the clustering of galaxies, especially for galaxies with masses M∗ > 10^9.77 M⊙.
While it may seem counter-intuitive to have the number of galaxies at some mass increase while their clustering decreases, keep in mind that it is the (normalised) galaxy distribution within each halo that is driving the clustering prediction, and this distribution flattens when the aforementioned time scales decrease. The parameter changes in γ and αfriction alone, with respect to the best-fit parameters of the SMF-only model, already produce predictions that are very close to those of the SMF+w(rp) model. While the decrease in the dynamical friction and reincorporation time scales each brings the clustering into better agreement with data separately, a change in both simultaneously is needed as they affect the SMF in different (adverse) ways.

We show the shift in parameter values in Figure 6.8. We again indicate the results for all three models: the original G13 model (green dotted lines), the SMF-only model (blue lines), and the SMF+w(rp) model (red lines). Histograms indicate the Bayesian likelihood regions as derived from the full MCMC chains, while the vertical dashed lines indicate the best-fit values. Both the likelihood regions and the best-fit values of the SMF-only and SMF+w(rp) models are generally consistent. The largest exceptions to this are the star formation efficiency αSF, the cold gas mass star formation threshold M̃crit, the metal yield y, and the previously mentioned reincorporation scale factor γ and dynamical friction scale factor αfriction. The latter two cause the main decrease in the clustering predictions, needed to bring them into agreement with observations. The significant increase in the star formation efficiency and the decrease in the cold gas mass threshold for star formation, on the other hand, mainly affect the SMF, compensating for the decrease in high-mass galaxies due to the more active AGN, caused in turn by the change in αfriction.
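The two time-scale changes discussed above can be summarised as simple ratios between the SMF+w(rp) and SMF-only best fits. This is only a sketch: it assumes that the orphan merging time scales linearly with αfriction and that the reincorporation time scales inversely with γ, as the text implies, and the superscript labels are introduced here purely for illustration.

```latex
\frac{t_{\rm friction}^{{\rm SMF}+w}}{t_{\rm friction}^{\rm SMF}}
  = \frac{\alpha_{\rm friction}^{{\rm SMF}+w}}{\alpha_{\rm friction}^{\rm SMF}}
  < \frac{1}{3},
\qquad
\frac{t_{\rm reinc}^{{\rm SMF}+w}}{t_{\rm reinc}^{\rm SMF}}
  = \frac{\gamma^{\rm SMF}}{\gamma^{{\rm SMF}+w}}
  < \frac{1}{5}.
```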
Finally, the large change in the metal yield is of little consequence, as this parameter is largely unconstrained by both the SMF and correlation functions.

Figure 6.9: The effect of the changes in the supernova parameters ε, Vreheat, β1, η, Veject and β2. The mass-loading factor (left panel) goes up slightly when using the correlation function as an additional constraint, but the change is not significant with respect to the 2σ regions allowed (also shown). The same goes for the supernova ejection efficiency (right panel).

To show the effect of the changes in the feedback parameters (ε, Vreheat, β1, η, Veject and β2), we turn to Figure 6.9. In the left-hand panel, we show the SN mass loading as a function of the maximum virial velocity of the halo, for all three models. We also indicate the 2σ regions allowed by the parameters for the SMF+w(rp) model. It is clear that while the supernova mass loading increases when the clustering data is used as an additional constraint, the change is not significant. The right-hand panel of Figure 6.9 shows how the SN ejection efficiency changes between the different models. Because the parameter η is significantly higher in the SMF+w(rp) model relative to the others, the high-Vmax horizontal asymptote of this function is increased, meaning SNe are more effective at ejecting material for galaxies occupying massive haloes. However, the large 2σ regions again indicate that the constraints used here are not very sensitive to these changes.

6.4 Summary

We have developed a fast and accurate clustering estimator, capable of predicting the projected galaxy correlation function for a full galaxy catalogue to within ∼ 10% accuracy using only a very small subsample of haloes (< 0.1% of the total sample). In this work, we have described our estimator and demonstrated its effectiveness for use in constraining parameter space for semi-analytical models of galaxy formation, using the Guo et al.
(2013) version of the Munich SAM as a test case. Our estimator determines the halo occupation distribution of galaxies in the subsample and fits a profile to the galaxy distribution within haloes as a function of halo mass, using these quantities in a halo-model-based approach to determine the galaxy clustering of the full sample. By being able to quickly predict the two-point galaxy correlation function for the first time while exploring parameter space, one can use clustering observations to limit the range allowed to the galaxy formation parameters of any SAM, adding constraints complementary to those of one-point functions typically used today, such as the stellar mass or luminosity function.

As we have demonstrated, this may lead to different sets of parameters through which the resulting model is able to provide a better match to the observed stellar mass and correlation functions simultaneously. For the G13 model tested here, the improved match to the correlation function is achieved mainly by significantly decreasing the time it takes for stripped (orphan) satellite galaxies to merge with their centrals, as well as the time it takes for gas ejected into the hot halo by feedback processes to be reincorporated. Both changes cause the galaxy distribution profiles within haloes to flatten, lowering the clustering on small scales. Other parameter shifts mainly serve to keep the changes in the SMF caused by the reduced time scales in check.

While the use of the clustering estimator presented here clearly has merit, some issues remain to be solved. The estimator tends to over-predict clustering on small scales, leading to final results that tend to fall ∼ 10% below the observational constraints.
Improving the model, for example by adding higher-order terms to the linear halo bias currently used, or basing the clustering predictions of galaxies directly on the measured clustering of the haloes in N-body simulations, may help. Additionally, the agreement with the SMF could be improved at low mass. We will explore these topics in future work.

Acknowledgements

The authors thank Joop Schaye for useful discussions and comments on the manuscript. The Millennium Simulation databases used in this chapter and the web application providing online access to them were constructed as part of the activities of the German Astrophysical Virtual Observatory. This work was supported by the Marie Curie Initial Training Network CosmoComp (PITN-GA-2009-238356) and by Advanced Grant 246797 "GALFORMOD" from the European Research Council.

References

Abadi M. G., Navarro J. F., Fardal M., Babul A., Steinmetz M., 2010, MNRAS, 407, 435
Allgood B., Flores R. A., Primack J. R., Kravtsov A. V., Wechsler R. H., Faltenbacher A., Bullock J. S., 2006, MNRAS, 367, 1781
Angulo R. E., Lacey C. G., Baugh C. M., Frenk C. S., 2009, MNRAS, 399, 983
Angulo R. E., Springel V., White S. D. M., Jenkins A., Baugh C. M., Frenk C. S., 2012, MNRAS, 426, 2046
Angulo R. E., White S. D. M., 2010a, MNRAS, 405, 143
Angulo R. E., White S. D. M., 2010b, MNRAS, 401, 1796
Bagla J. S., 2002, Journal of Astrophysics and Astronomy, 23, 185
Bailin J., Steinmetz M., 2005, ApJ, 627, 647
Balaguera-Antolínez A., Porciani C., 2013, J. Cosmology Astropart. Phys., 4, 22
Balogh M. L., Pearce F. R., Bower R. G., Kay S. T., 2001, MNRAS, 326, 1228
Bardeen J. M., Bond J. R., Kaiser N., Szalay A. S., 1986, ApJ, 304, 15
Barnes J., Hut P., 1986, Nature, 324, 446
Barriga J., Gaztañaga E., 2002, MNRAS, 333, 443
Bartko H. et al., 2010, ApJ, 708, 834
Baugh C. M., 2006, Reports on Progress in Physics, 69, 3101
Baugh C. M., Cole S., Frenk C. S., 1996, MNRAS, 283, 1361
Baugh C. M., Lacey C. G., Frenk C. S., Granato G. L., Silva L., Bressan A., Benson A.
J., Cole S., 2005, MNRAS, 356, 1191
Behroozi P. S., Conroy C., Wechsler R. H., 2010, ApJ, 717, 379
Berlind A. A., Weinberg D. H., 2002, ApJ, 575, 587
Bett P., Eke V., Frenk C. S., Jenkins A., Helly J., Navarro J., 2007, MNRAS, 376, 215
Bhattacharya S., Heitmann K., White M., Lukić Z., Wagner C., Habib S., 2011, ApJ, 732, 122
BICEP2 Collaboration et al., 2014, Physical Review Letters, 112, 241101
Binggeli B., 1982, A&A, 107, 338
Blumenthal G. R., Faber S. M., Flores R., Primack J. R., 1986, ApJ, 301, 27
Bond J. R., Cole S., Efstathiou G., Kaiser N., 1991, ApJ, 379, 440
Bondi H., Hoyle F., 1944, MNRAS, 104, 273
Booth C. M., Schaye J., 2009, MNRAS, 398, 53
Booth C. M., Schaye J., 2011, MNRAS, 413, 1158
Bower R. G., Benson A. J., Malbon R., Helly J. C., Frenk C. S., Baugh C. M., Cole S., Lacey C. G., 2006, MNRAS, 370, 645
Boylan-Kolchin M., Ma C.-P., Quataert E., 2008, MNRAS, 383, 93
Boylan-Kolchin M., Springel V., White S. D. M., Jenkins A., Lemson G., 2009, MNRAS, 398, 1150
Bryan S. E., Kay S. T., Duffy A. R., Schaye J., Vecchia C. D., Booth C. M., 2013, MNRAS, 429, 3316
Bullock J. S., Kolatt T. S., Sigad Y., Somerville R. S., Kravtsov A. V., Klypin A. A., Primack J. R., Dekel A., 2001, MNRAS, 321, 559
Carter D., Metcalfe N., 1980, MNRAS, 191, 325
Casarini L., Macciò A. V., Bonometto S. A., Stinson G. S., 2011a, MNRAS, 412, 911
Casarini L., Macciò A. V., Bonometto S. A., Stinson G. S., 2011b, MNRAS, 412, 911
Chabrier G., 2003, PASP, 115, 763
Colberg J. M., White S. D. M., Jenkins A., Pearce F. R., 1999, MNRAS, 308, 593
Cole S., Aragon-Salamanca A., Frenk C. S., Navarro J. F., Zepf S. E., 1994, MNRAS, 271, 781
Cole S., Lacey C. G., Baugh C. M., Frenk C. S., 2000, MNRAS, 319, 168
Cole S. et al., 2005, MNRAS, 362, 505
Colless M. et al., 2001, MNRAS, 328, 1039
Colombi S., Jaffe A., Novikov D., Pichon C., 2009, MNRAS, 393, 511
Conroy C., Wechsler R. H., Kravtsov A. V., 2006, ApJ, 647, 201
Cooray A., Sheth R., 2002, Phys. Rep., 372, 1
Croton D. J., Gao L., White S. D. M., 2007, MNRAS, 374, 1303
Croton D. J.
et al., 2006, MNRAS, 365, 11
Cui W., Borgani S., Dolag K., Murante G., Tornatore L., 2012, MNRAS, 423, 2279
Cui W., Borgani S., Murante G., 2014, MNRAS, 441, 1769
Cusworth S. J., Kay S. T., Battye R. A., Thomas P. A., 2014, MNRAS
Dalla Vecchia C., Schaye J., 2008, MNRAS, 387, 1431
Davis M., Efstathiou G., Frenk C. S., White S. D. M., 1985, ApJ, 292, 371
De Lucia G., Blaizot J., 2007, MNRAS, 375, 2
Dolag K., Borgani S., Murante G., Springel V., 2009, MNRAS, 399, 497
Dubois Y., Teyssier R., 2008, A&A, 477, 79
Duffy A. R., Schaye J., Kay S. T., Dalla Vecchia C., 2008, MNRAS, 390, L64
Duffy A. R., Schaye J., Kay S. T., Dalla Vecchia C., Battye R. A., Booth C. M., 2010, MNRAS, 405, 2161
Dutton A. A., Macciò A. V., 2014, MNRAS, 441, 3359
Einasto J., 1965, Trudy Astrofizicheskogo Instituta Alma-Ata, 5, 87
Eisenstein D. J., Hu W., 1998, ApJ, 496, 605
Eisenstein D. J., Hu W., 1999, ApJ, 511, 5
Eke V. R., Navarro J. F., Steinmetz M., 2001, ApJ, 554, 114
Eriksen H. K., Lilje P. B., Banday A. J., Górski K. M., 2004, ApJS, 151, 1
Fakhouri O., Ma C.-P., 2010, MNRAS, 401, 2245
Faltenbacher A., White S. D. M., 2010, ApJ, 708, 469
Fedeli C., 2014, J. Cosmology Astropart. Phys., 4, 28
Ferland G. J., Korista K. T., Verner D. A., Ferguson J. W., Kingdon J. B., Verner E. M., 1998, PASP, 110, 761
Fu L. et al., 2008, A&A, 479, 9
Gao L., Navarro J. F., Cole S., Frenk C. S., White S. D. M., Springel V., Jenkins A., Neto A. F., 2008, MNRAS, 387, 536
Gao L., Springel V., White S. D. M., 2005, MNRAS, 363, L66
Gao L., White S. D. M., 2007, MNRAS, 377, L5
Genel S., Bouché N., Naab T., Sternberg A., Genzel R., 2010, ApJ, 719, 229
Gil-Marín H., Jimenez R., Verde L., 2011, MNRAS, 414, 1207
Giocoli C., Bartelmann M., Sheth R. K., Cacciato M., 2010, MNRAS, 408, 300
Gnedin O. Y., Kravtsov A. V., Klypin A. A., Nagai D., 2004, ApJ, 616, 16
Governato F.
et al., 2012, MNRAS, 422, 1231
Grossi M., Verde L., Carbone C., Dolag K., Branchini E., Iannuzzi F., Matarrese S., Moscardini L., 2009, MNRAS, 398, 321
Guillet T., Teyssier R., Colombi S., 2010a, MNRAS, 405, 525
Guillet T., Teyssier R., Colombi S., 2010b, MNRAS, 405, 525
Guo Q., White S., Angulo R. E., Henriques B., Lemson G., Boylan-Kolchin M., Thomas P., Short C., 2013, MNRAS, 428, 1351
Guo Q. et al., 2011, MNRAS, 413, 101
Guo Q., White S., Li C., Boylan-Kolchin M., 2010, MNRAS, 404, 1111
Guzik J., Seljak U., 2002, MNRAS, 335, 311
Haardt F., Madau P., 2001, in Clusters of Galaxies and the High Redshift Universe Observed in X-rays, D. M. Neumann & J. T. V. Tran, ed.
Haas M. R., Schaye J., Booth C. M., Dalla Vecchia C., Springel V., Theuns T., Wiersma R. P. C., 2013, MNRAS, 435, 2931
Hamilton A. J. S., Kumar P., Lu E., Matthews A., 1991, ApJ, 374, L1
Hearin A. P., Zentner A. R., Ma Z., 2012, J. Cosmology Astropart. Phys., 4, 34
Heitmann K., White M., Wagner C., Habib S., Higdon D., 2010, ApJ, 715, 104
Henriques B. M. B., Thomas P. A., Oliver S., Roseboom I., 2009, MNRAS, 396, 535
Henriques B. M. B., White S. D. M., Thomas P. A., Angulo R. E., Guo Q., Lemson G., Springel V., 2013, MNRAS, 431, 3373
Hilbert S., Hartlap J., White S. D. M., Schneider P., 2009, A&A, 499, 31
Hinshaw G. et al., 2013, ApJS, 208, 19
Hopkins P. F., Keres D., Onorbe J., Faucher-Giguere C.-A., Quataert E., Murray N., Bullock J. S., 2013, preprint (arXiv:1311.2073)
Hoyle F., Lyttleton R. A., 1939, in Proceedings of the Cambridge Philosophical Society, Vol. 35, p. 405
Huterer D., Takada M., 2005, Astroparticle Physics, 23, 369
Jenkins A. et al., 1998, ApJ, 499, 20
Jenkins A., Frenk C. S., White S. D. M., Colberg J. M., Cole S., Evrard A. E., Couchman H. M. P., Yoshida N., 2001, MNRAS, 321, 372
Jing Y. P., Mo H. J., Börner G., 1998, ApJ, 494, 1
Jing Y. P., Suto Y., 2002, ApJ, 574, 538
Jing Y. P., Suto Y., Mo H. J., 2007, ApJ, 657, 664
Jing Y. P., Zhang P., Lin W.
P., Gao L., Springel V., 2006, ApJ, 640, L119
Kang X., 2014, MNRAS, 437, 3385
Kang X., Li M., Lin W. P., Elahi P. J., 2012, MNRAS, 422, 804
Kauffmann G., Colberg J. M., Diaferio A., White S. D. M., 1999, MNRAS, 303, 188
Kauffmann G., White S. D. M., Guiderdoni B., 1993, MNRAS, 264, 201
Kazantzidis S., Kravtsov A. V., Zentner A. R., Allgood B., Nagai D., Moore B., 2004, ApJ, 611, L73
Kennicutt, Jr. R. C., 1998, ApJ, 498, 541
Khandai N., Di Matteo T., Croft R., Wilkins S. M., Feng Y., Tucker E., DeGraf C., Liu M.-S., 2014, preprint (arXiv:1402.0888)
Killedar M., Borgani S., Meneghetti M., Dolag K., Fabjan D., Tornatore L., 2012, MNRAS, 427, 533
Komatsu E. et al., 2011, ApJS, 192, 18
Kravtsov A. V., 1999, PhD thesis, New Mexico State University
Kravtsov A. V., Berlind A. A., Wechsler R. H., Klypin A. A., Gottlöber S., Allgood B., Primack J. R., 2004, ApJ, 609, 35
Kravtsov A. V., Klypin A., Hoffman Y., 2002, ApJ, 571, 563
Kravtsov A. V., Nagai D., Vikhlinin A. A., 2005, ApJ, 625, 588
Landy S. D., Szalay A. S., 1993, ApJ, 412, 64
Larson R. B., 1998, MNRAS, 301, 569
Laureijs R., 2009, ArXiv e-prints
Le Brun A. M. C., McCarthy I. G., Schaye J., Ponman T. J., 2014, MNRAS, 441, 1270
Levine R., Gnedin N. Y., 2006, ApJ, 649, L57
Lewis A., Challinor A., 2002, Phys. Rev. D, 66, 023531
Lewis A., Challinor A., Lasenby A., 2000, ApJ, 538, 473
Li C., Jing Y. P., Kauffmann G., Börner G., Kang X., Wang L., 2007, MNRAS, 376, 984
Libeskind N. I., Yepes G., Knebe A., Gottlöber S., Hoffman Y., Knollmann S. R., 2010, MNRAS, 401, 1889
Ludlow A. D., Navarro J. F., Angulo R. E., Boylan-Kolchin M., Springel V., Frenk C., White S. D. M., 2014, MNRAS, 441, 378
Ma C., Caldwell R. R., Bode P., Wang L., 1999, ApJ, 521, L1
Ma C.-P., Fry J. N., 2000, ApJ, 543, 503
Macciò A. V., Dutton A. A., van den Bosch F. C., 2008, MNRAS, 391, 1940
Macciò A. V., Moore B., Stadel J., Diemand J., 2006, MNRAS, 366, 1529
Manera M., Sheth R.
K., Scoccimarro R., 2010, MNRAS, 402, 589
Martizzi D., Mohammed I., Teyssier R., Moore B., 2014, MNRAS, 440, 2290
Martizzi D., Teyssier R., Moore B., Wentz T., 2012, MNRAS, 422, 3081
Massey R. et al., 2007, ApJS, 172, 239
McCarthy I. G., Schaye J., Bower R. G., Ponman T. J., Booth C. M., Dalla Vecchia C., Springel V., 2011, MNRAS, 412, 1965
McCarthy I. G. et al., 2010, MNRAS, 406, 822
McDonald P. et al., 2006, ApJS, 163, 80
Mead J. M. G., King L. J., Sijacki D., Leonard A., Puchwein E., McCarthy I. G., 2010, MNRAS, 406, 434
Merritt D., Graham A. W., Moore B., Diemand J., Terzić B., 2006, AJ, 132, 2685
Merritt D., Navarro J. F., Ludlow A., Jenkins A., 2005, ApJ, 624, L85
Monaco P., Fontanot F., Taffoni G., 2007, MNRAS, 375, 1189
More S., Kravtsov A. V., Dalal N., Gottlöber S., 2011, ApJS, 195, 4
Moster B. P., Somerville R. S., Maulbetsch C., van den Bosch F. C., Macciò A. V., Naab T., Oser L., 2010, ApJ, 710, 903
Muldrew S. I., Pearce F. R., Power C., 2011, MNRAS, 410, 2617
Murray S. G., Power C., Robotham A. S. G., 2013, MNRAS, 434, L61
Navarro J. F., Frenk C. S., White S. D. M., 1997, ApJ, 490, 493
Navarro J. F. et al., 2004, MNRAS, 349, 1039
Navarro J. F. et al., 2010, MNRAS, 402, 21
Neto A. F. et al., 2007, MNRAS, 381, 1450
Paz D. J., Sgró M. A., Merchán M., Padilla N., 2011, MNRAS, 414, 2029
Peacock J. A., Dodds S. J., 1994, MNRAS, 267, 1020
Peacock J. A., Dodds S. J., 1996, MNRAS, 280, L19
Peacock J. A., Smith R. E., 2000, MNRAS, 318, 1144
Peebles P. J. E., 1980, The large-scale structure of the universe, Peebles, P. J. E., ed.
Peebles P. J. E., 1993, Principles of physical cosmology, Peebles, P. J. E., ed.
Pillepich A., Porciani C., Hahn O., 2010, MNRAS, 402, 191
Planck Collaboration et al., 2013, preprint (arXiv:1303.5076)
Prada F., Klypin A. A., Cuesta A. J., Betancort-Rijo J. E., Primack J., 2012, MNRAS, 423, 3018
Press W. H., Schechter P., 1974, ApJ, 187, 425
Ragone-Figueroa C., Plionis M., Merchán M., Gottlöber S., Yepes G., 2010, MNRAS, 407, 581
Reed D. S., Bower R., Frenk C.
S., Jenkins A., Theuns T., 2007, MNRAS, 374, 2
Reed D. S., Bower R., Frenk C. S., Jenkins A., Theuns T., 2009, MNRAS, 394, 624
Reed D. S., Koushiappas S. M., Gao L., 2011, MNRAS, 415, 3177
Refregier A., Amara A., Kitching T. D., Rassat A., 2011, A&A, 528, A33
Reid B. A. et al., 2010, MNRAS, 404, 60
Romano-Díaz E., Shlosman I., Heller C., Hoffman Y., 2010, ApJ, 716, 1095
Rosswog S., 2009, New Astronomy Review, 53, 78
Rudd D. H., Zentner A. R., Kravtsov A. V., 2008, ApJ, 672, 19
Sánchez A. G., Baugh C. M., Angulo R., 2008, MNRAS, 390, 1470
Sawala T., Frenk C. S., Crain R. A., Jenkins A., Schaye J., Theuns T., Zavala J., 2013, MNRAS, 431, 1366
Scannapieco C. et al., 2012, MNRAS, 423, 1726
Schaye J., 2004, ApJ, 609, 667
Schaye J., Dalla Vecchia C., 2008, MNRAS, 383, 1210
Schaye J. et al., 2010, MNRAS, 402, 1536
Schewtschenko J. A., Macciò A. V., 2011, MNRAS, 413, 878
Schrabback T. et al., 2010, A&A, 516, A63
Seljak U., 2000, MNRAS, 318, 203
Seljak U., Warren M. S., 2004, MNRAS, 355, 129
Seljak U., Zaldarriaga M., 1996, ApJ, 469, 437
Semboloni E., Hoekstra H., Schaye J., 2013, MNRAS, 434, 148
Semboloni E., Hoekstra H., Schaye J., van Daalen M. P., McCarthy I. G., 2011, MNRAS, 417, 2020
Shankar F., Lapi A., Salucci P., De Zotti G., Danese L., 2006, ApJ, 643, 14
Sheth R. K., Jain B., 2003, MNRAS, 345, 529
Sheth R. K., Mo H. J., Tormen G., 2001, MNRAS, 323, 1
Sheth R. K., Tormen G., 1999, MNRAS, 308, 119
Simha V., Cole S., 2013, MNRAS, 436, 1142
Simha V., Weinberg D. H., Davé R., Fardal M., Katz N., Oppenheimer B. D., 2012, MNRAS, 423, 3458
Smargon A., Mandelbaum R., Bahcall N., Niederste-Ostholt M., 2012, MNRAS, 423, 856
Smith R. E., Desjacques V., Marian L., 2011, Phys. Rev. D, 83, 043526
Smith R. E. et al., 2003a, MNRAS, 341, 1311
Smith R. E. et al., 2003b, MNRAS, 341, 1311
Smith R. E., Reed D. S., Potter D., Marian L., Crocce M., Moore B., 2014, MNRAS, 440, 249
Smith R. E., Scoccimarro R., Sheth R. K., 2007, Phys. Rev. D, 75, 063512
Smith R. E., Watts P. I. R., 2005, MNRAS, 360, 203
Somerville R. S., Hopkins P. F., Cox T.
J., Robertson B. E., Hernquist L., 2008, MNRAS, 391, 481
Somerville R. S., Primack J. R., 1998, ArXiv Astrophysics e-prints
Spergel D. N. et al., 2007, ApJS, 170, 377
Spergel D. N. et al., 2003, ApJS, 148, 175
Splinter R. J., Melott A. L., Linn A. M., Buck C., Tinker J., 1997, ApJ, 479, 632
Springel V., 2005a, MNRAS, 364, 1105
Springel V., 2005b, MNRAS, 364, 1105
Springel V., 2010, ARA&A, 48, 391
Springel V., Di Matteo T., Hernquist L., 2005, MNRAS, 361, 776
Springel V., Hernquist L., 2003, MNRAS, 339, 289
Springel V. et al., 2008, MNRAS, 391, 1685
Springel V. et al., 2005, Nature, 435, 629
Springel V., White S. D. M., Tormen G., Kauffmann G., 2001, MNRAS, 328, 726
Stadel J., Potter D., Moore B., Diemand J., Madau P., Zemp M., Kuhlen M., Quilis V., 2009, MNRAS, 398, L21
Stanek R., Rudd D., Evrard A. E., 2009, MNRAS, 394, L11
Stinson G., Seth A., Katz N., Wadsley J., Governato F., Quinn T., 2006, MNRAS, 373, 1074
Stott J. P. et al., 2012, MNRAS, 422, 2213
Takahashi R., Sato M., Nishimichi T., Taruya A., Oguri M., 2012, ApJ, 761, 152
Teyssier R., 2002, A&A, 385, 337
Tinker J., Kravtsov A. V., Klypin A., Abazajian K., Warren M., Yepes G., Gottlöber S., Holz D. E., 2008, ApJ, 688, 709
Tinker J. L., Robertson B. E., Kravtsov A. V., Klypin A., Warren M. S., Yepes G., Gottlöber S., 2010, ApJ, 724, 878
Tinker J. L., Weinberg D. H., Zheng Z., Zehavi I., 2005, ApJ, 631, 41
Tissera P. B., White S. D. M., Pedrosa S., Scannapieco C., 2010, MNRAS, 406, 922
Tormen G., 1996, in Dark Matter in Cosmology Quantum Measurements Experimental Gravitation, Ansari R., Giraud-Heraud Y., Tran Thanh Van J., eds., p. 207
Tormen G., Bouchet F. R., White S. D. M., 1997, MNRAS, 286, 865
Trotta R., 2008, Contemporary Physics, 49, 71
Valageas P., Nishimichi T., 2011, A&A, 527, A87
Vale A., Ostriker J. P., 2004, MNRAS, 353, 189
van Daalen M. P., Angulo R. E., White S. D. M., 2012, MNRAS, 424, 2954
van Daalen M. P., Schaye J., Booth C. M., Dalla Vecchia C., 2011, MNRAS, 415, 3649
van den Bosch F.
C., More S., Cacciato M., Mo H., Yang X., 2013, MNRAS, 430, 725
Velliscig M., van Daalen M. P., Schaye J., McCarthy I. G., Cacciato M., Le Brun A. M. C., Vecchia C. D., 2014, MNRAS, 442, 2641
Vera-Ciro C. A., Sales L. V., Helmi A., Frenk C. S., Navarro J. F., Springel V., Vogelsberger M., White S. D. M., 2011, MNRAS, 416, 1377
Viel M., Becker G. D., Bolton J. S., Haehnelt M. G., 2013, Phys. Rev. D, 88, 043502
Viel M., Haehnelt M. G., Springel V., 2004, MNRAS, 354, 684
Vikhlinin A., Kravtsov A., Forman W., Jones C., Markevitch M., Murray S. S., Van Speybroeck L., 2006, ApJ, 640, 691
Wadsley J. W., Stadel J., Quinn T., 2004, New A, 9, 137
Wang J. et al., 2011, MNRAS, 413, 1373
Warren M. S., Abazajian K., Holz D. E., Teodoro L., 2006, ApJ, 646, 881
Watson W. A., Iliev I. T., D’Aloisio A., Knebe A., Shapiro P. R., Yepes G., 2013, MNRAS, 433, 1230
Wechsler R. H., Zentner A. R., Bullock J. S., Kravtsov A. V., Allgood B., 2006, ApJ, 652, 71
Weinberg D. H., Colombi S., Davé R., Katz N., 2008, ApJ, 678, 6
West M. J., 1989, ApJ, 347, 610
Wetzel A. R., Cohn J. D., White M., Holz D. E., Warren M. S., 2007, ApJ, 656, 139
White M., 2004, Astroparticle Physics, 22, 211
White S. D. M., 1994, ArXiv Astrophysics e-prints
White S. D. M., Frenk C. S., 1991, ApJ, 379, 52
Wiersma R. P. C., Schaye J., Smith B. D., 2009, MNRAS, 393, 99
Wiersma R. P. C., Schaye J., Theuns T., Dalla Vecchia C., Tornatore L., 2009, MNRAS, 399, 574
Xu G., 1995, ApJS, 98, 355
Yang X., Kratochvil J. M., Huffenberger K., Haiman Z., May M., 2013, Phys. Rev. D, 87, 023511
Yang X., Mo H. J., van den Bosch F. C., 2003, MNRAS, 339, 1057
Zehavi I. et al., 2005, ApJ, 630, 1
Zel’dovich Y. B., 1970, A&A, 5, 84
Zentner A. R., Rudd D. H., Hu W., 2008, Phys. Rev. D, 77, 043507
Zentner A. R., Semboloni E., Dodelson S., Eifler T., Krause E., Hearin A. P., 2013, Phys. Rev. D, 87, 043509
Zhan H., Knox L., 2004, ApJ, 616, L75
Zheng Z. et al., 2005, ApJ, 633, 791
Zhu G., Zheng Z., Lin W. P., Jing Y.
P., Kang X., Gao L., 2006, ApJ, 639, L5
Zu Y., Zheng Z., Zhu G., Jing Y. P., 2008, ApJ, 686, 41

Dutch summary

Galaxy formation and the structure of the Universe¹

When we look at the night sky, most of the objects we can see with the naked eye are stars. Cosmically speaking, all of these stars are very close to us, namely within our own galaxy, the Milky Way. Yet even the nearest star after the Sun, Proxima Centauri, at a distance of 4.24 light years², is for now completely out of reach. The Milky Way as a whole is about 100,000 light years across and contains hundreds of billions of stars, and is itself just one of the hundred billion galaxies in the visible Universe. Almost everything studied in astronomy therefore lies vastly beyond our reach. Nevertheless, we can learn a great deal about stars, galaxies and even the Universe as a whole, thanks to the light we see and our knowledge of the laws of nature. In this thesis, the Universe as a whole is the subject of research. We examine how the structure of the Universe (how all matter is distributed within the Universe) and the formation of galaxies are related to each other.

Structure formation

Thanks to the light we see and our knowledge of the laws of nature, we now know that our universe is expanding and that this expansion is currently accelerating because of what we call dark energy³. We also know that baryonic matter is only a small part of all matter in the Universe. Baryonic matter is all the matter we can see, including the material that everything on Earth is made of. We can see baryonic matter because it interacts directly with light: it can emit, absorb and scatter light. This is in contrast to dark matter, which only feels gravity and, as far as we know, does not interact with light, so that it cannot be observed directly.
Most of the matter in the Universe is dark matter.

¹ Although we are familiar with only one universe, we assume that multiple universes exist. Within astronomy we refer to our universe as "the Universe".
² To give some comparisons on a human scale: that is about 40,000,000,000,000 km (40 trillion kilometres), or a billion trips around the Earth, or 50 million trips to the Moon and back, or 130,000 round trips to the Sun. Voyager 1, launched in 1977, is the most distant human-made object, and even this space probe, after 37 years of travel, is only at the edge of our Solar System, at 1/21,000th of the distance between us and Proxima Centauri.
³ Dark energy is a strange property of empty space which, put briefly, causes more space to come into being. It is responsible only for the acceleration of the expansion: the expansion itself is a consequence of the Big Bang. Dark energy is everywhere and in equal amounts, but has no effect on Earth or even within galaxies, because gravity holds everything together. Galaxies that are far enough apart (and therefore barely feel each other's gravity), however, appear to be pushed apart at an accelerating rate by the dark energy in the space between them.

Figure 8.1: The evolution of structure in the Universe. Shown here is the density of dark matter in a square piece of the Universe whose sides are now more than 300 million light years long. In this representation we move along with the expansion of the Universe. The density increases from black to white. In the early Universe (left, 12 million years after the Big Bang) the density differences are still very small. Under the influence of gravity these slowly grow into a cosmic web (middle, almost a billion years after the Big Bang, and right, 13.8 billion years after the Big Bang, which is now).
Waar de dichtheden het hoogst zijn vormen halo’s van donkere materie, waarin sterrenstelsels vormen. Bij de grootste knooppunten in het web vormen clusters van sterrenstelsels. Deze verschillende soorten materie zijn niet gelijk verdeeld over de ruimte, op sommige plaatsen is de dichtheid hoger dan op andere plaatsen. Omdat zwaartekracht dichtheidsverschillen versterkte, waren deze in het vroege Universum dus kleiner dan nu. Vanuit minieme dichtheidsverschillen aan het begin van het Universum, 13,8 miljard jaar geleden, is langzaam onder invloed van zwaartekracht een kosmisch web van donkere materie gevormd (zie Figuur 8.1). De baryonische materie, die in het begin bijna uitsluitend bestond uit waterstof en helium4 , volgt de zwaartekracht van donkere materie. Doordat het gas wel interactie heeft met licht, kan het energie kwijt door licht uit te zenden (“gaskoeling”). Hierdoor kan baryonische materie nog hogere dichtheden bereiken dan donkere materie, waardoor kernfusie mogelijk wordt en sterren gevormd kunnen worden. Dit gebeurt daar waar de concentratie van donkere materie (en dus de zwaartekracht) het hoogst is, in “halo’s” van donkere materie. Hoe meer gas er naar het centrum van een halo stroomt, hoe meer sterren er kunnen vormen en hoe groter het sterrenstelsel wordt. Onder invloed van zwaartekracht kunnen halo’s (en hun sterrenstelsels) met elkaar versmelten om zo nog groter te worden, een proces dat we “merging” noemen. Sommige halo’s zijn groot genoeg om meerdere sterrenstelsels te bevatten. Verzamelingen van sterrenstelsels die dezelfde halo bewonen, noemen we “groepen” (enkele tot tientallen sterrenstelsels) en “clusters” (honderden tot duizenden sterrenstelsels, zie Figuur 8.2, links). 4 Alle zwaardere elementen, inclusief koolstof, zuurstof en ijzer, zijn gemaakt in sterren en verspreid door supernovae en stellaire winden. 
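
As a rough check of the human-scale comparisons given for the distance to Proxima Centauri (4.24 light-years), the arithmetic can be sketched as follows. The conversion factors used here (kilometres per light-year, the Earth's circumference, the mean distances to the Moon and the Sun) are rounded textbook values assumed for illustration; they are not taken from the thesis.

```python
# Rough check of the distance comparisons for Proxima Centauri (4.24 light-years).
# All conversion factors are rounded textbook values, assumed for illustration.
KM_PER_LIGHT_YEAR = 9.461e12      # kilometres in one light-year
EARTH_CIRCUMFERENCE_KM = 40_075   # equatorial circumference of the Earth
EARTH_MOON_KM = 384_400           # mean Earth-Moon distance
EARTH_SUN_KM = 1.496e8            # mean Earth-Sun distance

distance_km = 4.24 * KM_PER_LIGHT_YEAR
laps_around_earth = distance_km / EARTH_CIRCUMFERENCE_KM
moon_round_trips = distance_km / (2 * EARTH_MOON_KM)   # there and back
sun_round_trips = distance_km / (2 * EARTH_SUN_KM)     # there and back

print(f"distance to Proxima Centauri: {distance_km:.2e} km")
print(f"laps around the Earth:        {laps_around_earth:.2e}")
print(f"round trips to the Moon:      {moon_round_trips:.2e}")
print(f"round trips to the Sun:       {sun_round_trips:.2e}")
```

With these inputs the results come out near 4 × 10^13 km, about 10^9 laps around the Earth, about 5 × 10^7 round trips to the Moon and about 1.3 × 10^5 round trips to the Sun, consistent with the rounded figures quoted in the text.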
Feedback

Just as the distribution of matter determines where and how galaxies form, the formation and evolution of galaxies in turn affects the distribution of matter. The dark matter haloes respond to the formation of galaxies in their centres by contracting further. But that is not all: galaxy formation is a violent process and affects the distribution of (all) matter.

When stars die this usually happens quietly, but very massive stars (at least 8 times as massive as the Sun) can explode as supernovae. A supernova can fling gas away (which is called supernova feedback), changing how the gas is distributed. The gas can also be heated in the process, which counteracts the formation of new stars: the gas must first lose this energy before its density can become high enough for nuclear fusion.

Similar processes occur on a much larger scale at the supermassive black holes in the centres of galaxies. Here gas can even be driven far out of the galaxy. The gas is heated here too, sometimes to such high temperatures that it can take billions of years before the gas has lost this energy and can form structure again. The dark matter can respond to the expulsion of large amounts of gas by expanding. A supermassive black hole that interacts strongly with the gas around it is called an AGN (Active Galactic Nucleus). The process in which gas is heated and driven outwards is called AGN feedback (see Figure 8.2, right). AGN feedback and supernova feedback recur throughout the chapters of this thesis, not only because they affect the distribution of matter but also because their effect is larger than was previously assumed.

Numerical simulations

Because galaxies only form where the densities are highest, they are “biased tracers” of the overall matter distribution. This means that they give us a skewed picture of where all the matter is. Partly by running detailed computer simulations, we are learning ever better how the total matter distribution and the distribution of galaxies relate to each other. With this knowledge, carefully studying how galaxies are distributed in the Universe lets us learn more and more about the structure of the Universe as a whole.

There are several ways to model the relation between the formation of galaxies and the distribution of matter. Because all of them appear in this thesis, they are briefly described below.

Figure 8.2: Left: Example of a cluster of galaxies, Abell 2744. Many of the galaxies in this image (not those in the background) inhabit the same halo and orbit one another. The galaxies themselves, the only thing we can see in this image, contain just 5 per cent of the matter; 20 per cent is in the gas between the galaxies, which is so hot that it emits almost exclusively X-rays. The remaining 75 per cent of the mass is dark matter. Right: Example of AGN feedback in the galaxy NGC1275. This galaxy sits in the middle of the Perseus cluster (Abell 426). The round white light in the middle comes from the galaxy itself. The enormous purple cloud around it shows the X-ray emission of the surrounding gas that has been expelled and heated by AGN feedback. This gas has a temperature of tens of millions of degrees.

Hydrodynamical simulations

First, one can run cosmological, hydrodynamical simulations: computer calculations in which a significant part of the Universe is simulated, with both dark matter and gas.
In these, all the relevant physical equations, such as those for gravity and gas cooling, are evolved from the early Universe to the present. The more powerful the computer, the more particles the simulation can contain, and hence the finer the resolution. At present we cannot yet directly simulate processes such as star formation and supernovae in such cosmological simulations, because that requires a much higher resolution than even today’s best supercomputers can deliver. [5] We therefore use “sub-grid recipes”: prescriptions for how processes on scales smaller than we can simulate directly depend on properties on larger scales (for example, how the number of new stars formed per unit time within a single gas particle depends on the density and temperature measured around that particle). What comes out of the simulation therefore depends to a large extent on what we put into it ourselves. It is thus important that the sub-grid recipes are based on the laws of nature and on what observations of the real Universe tell us. It is also important that we have a description of every process that we cannot simulate directly: if we do not tell the simulation that something like an AGN exists and how it behaves, the simulation will give the wrong answer wherever AGN are important.

[5] To give an idea: in the hydrodynamical simulations described in this thesis, each particle is about as massive as 100 million suns.

N-body simulations

Simulations can also be run under the assumption that all matter is dark matter (“N-body” simulations). Such simulations are much simpler and can be run at a much higher resolution than hydrodynamical simulations: after all, the only equation that needs to be evolved is that of gravity.
Because dark matter is the dominant form of matter, and gravity the dominant process in structure formation, such simulations still teach us a great deal about the distribution of matter in the real Universe. We must keep in mind, however, that the effects of baryonic processes (such as gas cooling and feedback) on the dark matter are neglected here. Whether these processes are significant depends on the scale being considered.

Semi-analytic models

It is also possible to study how galaxies would have formed in the dark matter haloes of N-body simulations using semi-analytic models (“SAMs”). These assume that the growth and evolution of galaxies is fully determined by the properties of the dark matter. A semi-analytic model can be seen as a collection of interlocking sub-grid recipes: nothing is simulated directly except the dark matter. The advantage of such “simulations” is that they can be run very quickly, which makes it easy to test the effect of different prescriptions for star formation and so on. Such simulations do have more limitations than hydrodynamical simulations, however, since the dark matter distribution is fixed.

HOD and halo models

Finally, it is possible to depart one step further from direct simulation, by representing the haloes as spherical objects with a certain distribution in space and a certain density profile (usually based on the results of dark matter simulations). Examples are halo models and HOD (“halo occupation distribution”) models, with which the distribution of matter and of galaxies, respectively, can be predicted relatively quickly and simply, to a certain accuracy.

Each of these methods has its own advantages and disadvantages.
Direct, hydrodynamical simulations are the most complex and can therefore include the widest range of processes and effects. However, they also cost the most computing time and are therefore often run only once. Models that depart further from direct simulation are less precise but produce results faster, with the added advantage that the same model can be rerun many times with small variations in the free parameters (for example in the cosmological parameters that characterise our Universe). Each of the different outcomes can then be compared with observations, to see which parameters describe reality best. The approximations made (such as that baryonic matter and feedback do not affect the dark matter) must be kept in mind, however.

Clustering

If we want to compare reality with simulations, we need quantifiable measures: measurable properties to which we can attach a number. We can look, for example, at the number of galaxies with a given mass in stars, or that emit a given amount of light. In this thesis the emphasis is, as mentioned, on the distribution of galaxies and matter in space, which we call “clustering”. We can quantify the amount of clustering on a given scale using the correlation function and the “power spectrum”. [6]

The correlation function of galaxies gives the probability that two galaxies are found at a given distance from each other, relative to a random distribution. By determining the mutual distances in observations and simulations of thousands of galaxies, we can compute the correlation function by simply counting how often galaxies are at a given distance from each other.
If the correlation function is positive on a given scale (that is, for a given separation), this means that galaxies “like” to sit at this distance from each other. The correlation function rises steeply towards smaller scales, which means that galaxies are very often found close together; this fits the picture of a cosmic web in which galaxies form where the densities are highest.

The power spectrum is somewhat more complicated, but in short we use it in this thesis to characterise the clustering of (all) matter. In a simulation this is easy to measure, because we know exactly where all the matter is, but in the real Universe it is harder. After all, we can only observe the light of galaxies and gas directly. Fortunately, we are increasingly able to map the distribution of all matter, baryonic and dark, thanks to observations of “lensing”: the effect that light from galaxies is deflected slightly under the influence of gravity. By mapping these deflections very precisely, the gravitational field (and with it the distribution of matter) can be reconstructed and a power spectrum can be measured.

[6] The Dutch translation of this is “spectrale vermogensdichtheidsfunctie”, but that rolls off the tongue rather less easily. Incidentally, the Dutch language unfortunately lacks an equally concise yet strongly descriptive translation of the word “clustering”.

By comparing the correlation functions and power spectra of observations and simulations, we can learn more about our Universe. We do have to be careful with our simulations: if we ignore potentially important processes such as feedback, the comparison with reality breaks down and our interpretations of the observations may be wrong.
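
The pair-counting description of the correlation function given above can be illustrated with a small toy calculation. This is only a sketch: it assumes the simple “natural” estimator, xi = DD/RR - 1, where DD and RR are the normalised pair counts in the data and in an unclustered random catalogue. The summary does not say which estimator the thesis uses, and the function names, bin edges and mock point sets below are hypothetical.

```python
import math
import random


def pair_counts(points, edges):
    """Count unique pairs of points per separation bin [edges[i], edges[i+1])."""
    counts = [0] * (len(edges) - 1)
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            r = math.dist(points[i], points[j])
            for b in range(len(counts)):
                if edges[b] <= r < edges[b + 1]:
                    counts[b] += 1
                    break
    return counts


def correlation_function(data, randoms, edges):
    """Natural estimator: xi = (DD / RR) * [N_R (N_R - 1)] / [N_D (N_D - 1)] - 1."""
    dd = pair_counts(data, edges)
    rr = pair_counts(randoms, edges)
    n_d, n_r = len(data), len(randoms)
    norm = (n_r * (n_r - 1)) / (n_d * (n_d - 1))
    # Bins without random pairs carry no information, so they are marked as NaN.
    return [norm * d / r - 1.0 if r > 0 else float("nan") for d, r in zip(dd, rr)]


rng = random.Random(42)

# Toy "galaxies": 20 clumps of 10 points each, scattered tightly around random centres.
data = []
for _ in range(20):
    centre = [rng.random() for _ in range(3)]
    for _ in range(10):
        data.append(tuple(c + rng.gauss(0.0, 0.01) for c in centre))

# Unclustered comparison catalogue, uniform in the unit box.
randoms = [tuple(rng.random() for _ in range(3)) for _ in range(500)]

edges = [0.0, 0.05, 0.1, 0.2, 0.4]
xi = correlation_function(data, randoms, edges)
for lo, hi, value in zip(edges[:-1], edges[1:], xi):
    print(f"{lo:.2f} <= r < {hi:.2f}: xi = {value:+.2f}")
```

Because the mock points are placed in tight clumps, the estimate is strongly positive in the smallest separation bin and falls towards zero at larger separations, which is the qualitative behaviour described in the text. The brute-force double loop is only practical for small point sets; real catalogues call for tree-based or grid-based pair counting.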
In this thesis we therefore focus mainly on the effects that the formation of galaxies can have on the clustering of matter and of galaxies themselves.

In this thesis

Below follows a simplified summary of the contents of this thesis.

In Chapter 1 I give a more extensive introduction to the study of the Universe as a whole: cosmology. I describe in more detail how small density differences in the early Universe grow under the influence of gravity. I also go deeper into the role that galaxy formation plays in the overall distribution of matter.

In Chapter 2 we investigate the effects that various physical processes related to the formation and evolution of galaxies, including supernova and AGN feedback, can have on the clustering of matter. We compare hydrodynamical simulations with N-body simulations and vary which physical processes are included, so that we can test the effect of each process separately. We show that feedback can have a much larger influence on the power spectrum than previous studies have shown. We also investigate how the clustering of the dark matter changes as a result. We conclude that processes such as feedback must be taken into account, because they can strongly affect how well we will be able to interpret the precise clustering data that will be obtained in the near future.

Because such physical processes could also be important for the clustering of galaxies, we consider the relevant correlation functions in Chapter 3. Here, too, significant changes occur when feedback is taken into account. These changes arise mainly because feedback lowers the masses of the galaxies and their haloes, but there are also smaller, more complex effects that play a role.
We show that the most important of these effects is the redistribution of matter by feedback.

In this thesis we also examine the validity of some assumptions used in HOD and halo models. In Chapter 4 we look at how the assumption that dark matter haloes are spherical can affect the predicted clustering of galaxies. We use semi-analytic models to study the difference between using realistic haloes and artificial, round haloes. Under the assumption that haloes are round, the correlation function of galaxies can be strongly underestimated on small scales. It is therefore important to take realistic shapes into account, although we show that the orientation of the haloes makes little difference. We also show that the difference in clustering caused by the shape of the haloes should be measurable in the real Universe.

Using N-body simulations, we then test in Chapter 5 the halo-model assumption that all matter resides in haloes. We do this by computing the clustering of matter and comparing it with the clustering of only the matter that is in haloes. We show that matter outside haloes can also be important for the clustering, depending on the halo masses considered and the halo definition used.

Finally, in Chapter 6 we present a fast and accurate computer code for estimating the correlation function of galaxies in semi-analytic models. The correlation function for the whole simulation is estimated using only 1 in 1,000 galaxies. This allows the clustering to be determined much faster (with a known uncertainty) than by computing it directly.
This makes it possible to efficiently find the parameters of the semi-analytic model that best reproduce the clustering in the real Universe. We demonstrate this in the same chapter by applying our method in a semi-analytic model.

Publications

1. The clustering of baryonic matter. II: halo model and hydrodynamic simulations
   Cosimo Fedeli, Elisabetta Semboloni, Marco Velliscig, Marcel P. van Daalen, Joop Schaye, Henk Hoekstra
   2014, accepted by JCAP, arXiv:1406.5013

2. The impact of galaxy formation on the total mass, profiles and abundance of haloes
   Marco Velliscig, Marcel P. van Daalen, Joop Schaye, Ian G. McCarthy, Marcello Cacciato, Amandine M. C. Le Brun, Claudio Dalla Vecchia
   2014, MNRAS, 442, 2641

3. The impact of baryonic processes on the two-point correlation functions of galaxies, subhaloes and matter
   Marcel P. van Daalen, Joop Schaye, Ian G. McCarthy, C. M. Booth, Claudio Dalla Vecchia
   2014, MNRAS, 440, 2997

4. The effects of halo alignment and shape on the clustering of galaxies
   Marcel P. van Daalen, Raul E. Angulo, Simon D. M. White
   2012, MNRAS, 424, 2954

5. Quantifying the effect of baryon physics on weak lensing tomography
   Elisabetta Semboloni, Henk Hoekstra, Joop Schaye, Marcel P. van Daalen, Ian G. McCarthy
   2011, MNRAS, 417, 2020

6. The effects of galaxy formation on the matter power spectrum: A challenge for precision cosmology
   Marcel P. van Daalen, Joop Schaye, C. M. Booth, Claudio Dalla Vecchia
   2011, MNRAS, 415, 3649

Curriculum vitae

I was born on 8 January 1986 in The Hague. Around the age of 5 I realised I wanted to go into science, not yet knowing how broad a term “science” was and that it might therefore be useful to think about what kind of science I wanted to do. All I knew back then was that I wanted to discover and learn things that nobody knew yet.
After exasperating my parents with endless “how’s” and “why’s” about how the world worked all through elementary school, my interests began to point more towards space. During my time at the Alfrink College high school I especially loved mathematics and physics and, inspired by the great math teachers I had, briefly considered studying and then teaching mathematics. I was, and still am, also fascinated by computers and technology. However, the idea of “unravelling the Universe” won out, and in 2004 I started studying Astronomy at Leiden University. It only took one programming class in my first year to get me hooked, and I have not tired of programming since, which is a good thing considering my preference for the theoretical side of astronomy.

After a Bachelor’s project with Clovis Hopman on binary star break-up in the Galactic centre and a minor Master’s project with Jarle Brinchmann on gamma-ray bursts from Wolf-Rayet stars, I started working on the effects of galaxy formation on clustering with Joop Schaye. After obtaining my Master’s degree (cum laude) I continued working on the interface of galaxy formation and cosmology throughout my PhD project, working with Joop Schaye and with Simon White at the Max-Planck-Institut für Astrophysik in Garching, Germany. After spending the first half of my PhD living in Munich, I moved back to Leiden, travelling back and forth between the two places throughout.

The love for teaching that I had acquired in high school has remained, and during my studies and doctorate I very much enjoyed tutoring first-year students, assisting in courses, answering astronomy-related questions from the public as a member of the outreach committee, and participating in the popularisation of astronomy through outreach at high schools and giving public lectures.
In April I married my wonderful wife Marieke, and in September we will move across the Atlantic, where I will start my first position as a post-doctoral researcher at the University of California in Berkeley, USA. I plan to remain there for three years, working as a TAC Fellow.

Acknowledgements

Over the last four years many people have helped me along the way and/or made my time in Leiden and Munich more enjoyable, and I would like to express my gratitude to them.

First of all, I’m grateful for the support of my parents in my choice of career, even when they didn’t always understand it. I’m also grateful to the teachers who inspired me, especially Anton Dolle (my math teacher in the final years of high school) and Walter Kosters (who taught me the joys of C++).

A big thanks goes to all my colleagues at Leiden Observatory and MPA. Thanks to you, both institutes have great atmospheres that are enjoyable to work in. Clovis and Jarle, thank you for taking me through my first two research experiences and writing me many letters of recommendation (even after one of you left astronomy). Had I been allowed to thank my current supervisors (Joop and Simon) here, I would have said that I greatly appreciate the time and effort they put into mentoring me, sharing their expertise and making me a better scientist. Xander, thanks for keeping an eye on me when it seemed like I might not finish my thesis on time.

Thanks to all members (past and current) of Joop’s awesome OWLS research group, which I’m happy to have been a part of for the last five years. Your company and the twice-weekly discussions all contributed greatly to my time at the Sterrewacht. Craig, Claudio and Rob C., thanks for always making time for questions from us students. Olivera, I always enjoyed our conversations, and practising your Dutch. Ali, it was fun having you around, both here and in Munich. Marco and Marcello, it was great working with you, and I hope we can continue to do so in the future.
Marco, I also very much enjoyed complaining about subfind with you. Ben, Milan, Monica, Alex, Joki and Marijke, thank you for making the group more fun as well. Freeke, you were the best office mate anyone could wish for, and it was great having you around in Munich too; Ann-Marie, you were an awesome neighbour (twice). See you both in Berkeley!

I would also like to thank the people I worked with in Garching. Chervin and Rob Y., thanks for all the laughs. Ben and Laura, thanks for being so much fun. Raul and Bruno, it was awesome working with you, and I learned a lot.

The support staff at both Leiden Observatory and MPA is wonderful, and they’ve helped me with a lot of issues over the years. Evelijn and Arianne, thanks for all your help and enjoyable conversations, both at the physics department and at the Sterrewacht. Also a big thanks to the computer group at Leiden, Erik, David, Tycho, Niels and Aart, for their support (which I’ve made use of many, many times). I’m also extremely grateful to Jeanne, Anita and Liesbeth in Leiden, and Maria, Gabi and Cornelia at MPA, who helped me with many problems and questions along the way.

Finally, I would like to thank my amazing wife Marieke. Not only for her help with turning my propositions into actual propositions and making my Dutch summary more understandable (compounded by my stubbornness), but most of all for her undying love and support. Even when I moved to a different country. Without her, this thesis would have lost its meaning.
