EDP Sciences
Press Release
Free Access
Issue
A&A
Volume 598, February 2017
Article Number A58
Number of page(s) 18
Section Galactic structure, stellar clusters and populations
DOI https://doi.org/10.1051/0004-6361/201629990
Published online 30 January 2017

© ESO, 2017

1. Introduction

The hierarchical paradigm of structure formation predicts stellar halos to be the most important repositories of merger debris (Helmi & White 1999). Relics of accretion events may exist in the form of spatially coherent streams (Bullock & Johnston 2005; Cooper et al. 2010; Helmi et al. 2011) or of moving groups of stars for events that happened a long time ago (Helmi & White 1999). Data from wide area surveys such as SDSS and 2MASS have revealed much spatial substructure and painted a picture in which the outer halo (beyond approximately 20 kpc from the Galactic centre) was likely built solely via mergers (Newberg et al. 2002; Majewski et al. 2003; Belokurov et al. 2006; Deason et al. 2015). On the other hand, the assembly process of the inner halo, where most of the halo stars reside, is still unknown. Genuine halo streams crossing the Solar neighbourhood were in fact discovered almost two decades ago (Helmi et al. 1999). The granularity of the nearby halo has been estimated by Gould (2003) who determined that streams, if present, should each contain less than 5% of the stars. These estimates are consistent with later discoveries of substructures (Kepley et al. 2007; Morrison et al. 2009; Smith et al. 2009; Klement 2010).

Whether or not streams constitute only a minority of the halo is therefore still under scrutiny. However, models predict that if solely built via accretion, the stellar halo in the Solar neighbourhood should contain 300–500 streams originating mostly from a handful of massive progenitors (Helmi & White 1999; Helmi et al. 2003; Gómez et al. 2013). Such a halo would appear spatially very well-mixed and its granularity would only be truly apparent in local samples with accurate kinematics with at least ten times as many stars. Such large samples of stars with accurate full phase-space information have yet to become available.

On the other hand, an important fraction of the inner halo may have formed in-situ, either from a heated disk during merger events (Cooper et al. 2015), or from the gas during early collapse (Eggen et al. 1962; Zolotov et al. 2009). The idea of a “dual” nature of the stellar halo has gained momentum from the kinematics and metallicities of stars near the Sun (Carollo et al. 2007, 2010). A common explanation for this inner versus outer halo duality is different formation paths, namely in-situ versus accretion origins (Tissera et al. 2014).

The question of whether the halo was solely built by accretion or by more than one mechanism will most likely be answered with Gaia data. The Gaia mission was launched by ESA in December 2013 and will collect data for a period of at least five years. Its first data release, DR1, that became available on the 14th September, 2016, contains the positions on the sky and G-magnitudes for over a billion stars. It also provides the proper motions and parallaxes for over 2 million Tycho-2 sources, in what is known as the Tycho-Gaia Astrometric Solution (TGAS; Gaia Collaboration 2016; Lindegren et al. 2016).

To make progress on the accretion history of the Milky Way at this point in time, we must partially rely on ground-based efforts to obtain the required full phase-space information of halo stars. Several large spectroscopic surveys have been carried out in the past decade, and the one that best matches TGAS in terms of extent and magnitude range is RAVE (Steinmetz et al. 2006). The RAVE survey has obtained spectra for more than 500 000 stars in the magnitude range 9 <I< 12. Its last data release, DR5 (Kunder et al. 2017), provides radial velocities for over 400 000 independent stars, as well as astrophysical parameters for a good fraction of these objects.

We take advantage of the powerful synergy between TGAS and RAVE to construct a high-quality dataset that allows us to explore the formation and structure of the stellar halo. The results presented in this paper may be considered as an appetiser for what can be expected, as well as a teaser of the challenges to come when the next Gaia data release becomes available.

This paper is organised as follows. In Sect. 2, we introduce the dataset obtained by cross-matching TGAS to RAVE, from which we construct a halo sample based on metallicity. In Sect. 3, we study the distribution of this sample of halo stars in phase-space. To establish the presence of streams, we compute the velocity correlation function in Sect. 3.1, while in Sect. 3.2, we identify the presence of several statistically significant substructures in the space of integrals of motion, that is, defined by energy and two components of the angular momentum. In Sect. 4, we discuss our findings and make quantitative comparisons to cosmological simulations of the formation of galaxies such as the Milky Way. We present our conclusions in Sect. 5.

2. Data

2.1. TGAS and RAVE

2.1.1. General description of the dataset

The Gaia satellite has allowed a new determination of the astrometric parameters (proper motions and parallaxes) for Tycho-2 stars (Høg et al. 2000) by taking advantage of the large time baseline between 1991.5 and 2015.0, that is, between the Tycho epoch and the measurements obtained during approximately the first year of science operations by the Gaia mission (Lindegren et al. 2016). As part of the Gaia DR1 release, the astrometric parameters for ~2 × 106 stars (80% of the Tycho-2 dataset; those with TGAS parallax error smaller than 1 mas) with magnitudes 6 ≲ G< 13 have been made publicly available to the community (Gaia Collaboration 2016). Most of these stars are within a few kpc from the Sun, while a few objects exist at distances of ~50 kpc, such as supergiants in the Large Magellanic Cloud.

The RAVE survey has obtained spectra for ~5 × 105 stars in the southern hemisphere since its start in 2003, from which radial velocities have been derived (Steinmetz et al. 2006). If the spectrum is of sufficient quality (S/N ≥ 20, Kordopatis et al. 2013a) then astrophysical parameters such as gravity, temperature and metallicity, and hence absolute magnitude and distance, can be estimated.

We have made our own cross-match between TGAS and RAVE DR5, and found 210, 263 stars in common. Although this is not a very large sample (only 10% of the full size of TGAS), it constitutes a very good starting point to explore the dynamics and kinematics of stars near the Sun. This is particularly true if we compare to what has been possible with Hipparcos or Tycho-2 in combination with the Geneva Copenhagen Survey (Nordström et al. 2004) or even with RAVE, for example. Of this cross-matched set, 203 992 stars have spectra with velocity error ϵRV ≤ 10 km s-1 and CorrCoeff ≥ 10, indicating a good measurement of the radial velocity. If we further consider only stars that i) satisfy the S/N ≥ 20 criterion and the flag algoConv 1 that ensures a reliable determination of astrophysical parameters; and ii) have a relative parallax error 30% (be it from TGAS or RAVE), the sample is reduced to 170,509 objects. For 29.5% of the stars in this set, we use the RAVE parallaxes because they have a smaller relative error than those in TGAS. As a side note, we remark that our results do not change significantly if we use the parallaxes with the smaller absolute error between TGAS and RAVE instead.

Although we have imposed a relative parallax error cut to have a good quality dataset, the parallaxes of TGAS and RAVE could still be subject to (different) systematic errors. In particular, Binney et al. (2014) have performed a very careful comparison of the parallaxes derived by the RAVE pipeline to those derived using alternative methods. These included the parallaxes obtained by Hipparcos, the distances to open clusters derived with giants or with dwarfs, and the kinematic bias correction method of Schönrich et al. (2012). In all cases, Binney et al. (2014) found evidence that the parallaxes of RAVE giants were underestimated by 10−15%.

In Fig. 1, we plot the distribution of the ratio between the TGAS and RAVE parallaxes for all stars satisfying the RAVE quality criteria (S/N ≥ 20 and algoConv 1) and with TGAS positive parallax. The dotted histogram corresponds to dwarf stars (log g ≥ 3.5) while the solid histogram to giants (log g< 3.5), and clearly shows the slight RAVE parallax underestimation in the same sense and with similar amplitude as reported by Binney et al. (2014). We find that the amplitude of the bias is on average 11%, as indicated by the vertical solid line. We have therefore decided to rescale the RAVE parallaxes for giant stars as , and in the rest of the paper we work with this new parallax scale. We note here that the results presented in this paper are not strongly dependent on this scaling.

thumbnail Fig. 1

Distribution of the ratio between the parallax in TGAS to that in RAVE. The vertical lines indicate the mean ratio for dwarfs (dotted) and giants (solid) and show that dwarfs’ parallaxes are consistent with one another in the datasets, but that the giants are systematically underestimated by 11% in RAVE.

Open with DEXTER

thumbnail Fig. 2

Velocities of stars in the TGAS cross-matched to the RAVE sample described in Sects. 2.1 and 2.2, selected according to the RAVE metallicity [M/H] ≤ −1.5 dex. The subset of stars classified as belonging to the halo according to a two-component Gaussian kinematic model, are plotted with open circles.

Open with DEXTER

2.1.2. Coordinate transformations

For the analysis presented below, we transform the coordinates measured for the stars (α,δ,ϖ,μα,μδ,vlos) into a Cartesian coordinate system.

We compute the distance to a star by taking the reciprocal value of its parallax. Although this approach is not quite correct when the errors on the parallax are large (see e.g. Arenou & Luri 1999; Smith 2006; Astraatmadja & Bailer-Jones 2016), Binney et al. (2014) have shown that for RAVE stars, the best distance indicator is 1/ϖ where ϖ is the maximum likelihood estimate of the parallax given by the RAVE pipeline. Furthermore, in Appendix A we explore the effect of errors when the distance is calculated by inverting the parallax using simulations based on the Gaia Universe Model Snapshot (GUMS, Robin et al. 2012). We find that the distance obtained in this way, even for relative parallax errors of approximately 30%, coincides with the true distance, when assuming the parallax error distribution is Gaussian. This analysis, together with the fact that the TGAS and RAVE parallaxes are in good agreement, gives us confidence in the methodology used.

We use the positions on the sky in Galactic (l,b) coordinates, together with the distances to obtain the Cartesian (x,y,z) coordinates. To calculate the corresponding uncertainties, we used the errors on both (l,b) and distance, as well as the covariance between l and b. To compute the velocities, we first converted the proper motions from (μα,μδ) to (μl,μb) as described in Poleski (2013), where the (α, δ) uncertainties and their covariance are all taken into account to obtain the errors on (μl,μb). Together with the line-of-sight velocity vlos, all these are used to derive (vx,vy,vz) using the transformations presented in Johnson & Soderblom (1987). For the calculations of the associated uncertainties, we have propagated the errors in distance, μl,μb, line-of-sight velocity, as well as the covariance between μl and μb. Finally, we assume the Sun is located at 8 kpc from the Galactic centre, and a Local Standard of Rest velocity VLSR = 220 km s-1 in the direction of rotation (aligned with the y-axis at the location of the Sun). We do not correct for the peculiar motion of the Sun because, for halo stars, this will only introduce a negligible offset in the velocities.

2.2. Construction of a halo sample

2.2.1. Selection criteria

The metallicities provided by the RAVE pipeline can be used to select potential halo stars. In the set of 170 509 stars for which the atmospheric parameters have been reliably determined and with relative parallax errors smaller than 0.3, we find 2013 objects with [M/H] ≤ −1.5, and with distances greater than 100 pc. The latter condition is used to reduce contamination by nearby disk stars.

We also consider another possibility, namely to select candidate halo stars based on colours using the method developed by Schlaufman & Casey (2014), also reported in Kunder et al. (2017). The combination of 2MASS and WISE colours selections 0.55 ≤ (J2MASSK2MASS) ≤ 0.85 and −0.04 ≤ (W1−W2) ≤ 0.04 turns out to be very effective. This sample also still has some amount of contamination by nearby disk red dwarfs, although significantly reduced because of the WISE colour selection. Therefore we again remove all stars within 100 pc from the Sun. This leaves us with a sample of 1912 stars.

2.2.2. Decomposition in disk and halo

No selection will lead to a completely pure halo sample, although Fig. 2 shows that the level of contamination by (thin and thick) disk stars is rather low for our RAVE-metallicity-selected sample. Disk stars can be seen to cluster around vy ~ 200 km s-1, while, on average, the halo stars have vy ~ 0 km s-1 as the figure clearly shows. Because we are interested in determining the level of kinematic and phase-space substructure in this sample, and a disk may itself be considered as a dominating (sub)structure, we proceed to flag the stars by determining the probability that they belong to the halo or to the disk (where we make no distinction between thin and thick disks). To this end, we used the sci-kit learn package in python (Pedregosa et al. 2011) and fitted a two component Gaussian Mixture Model to the Cartesian velocities of the stars, vx, vy and vz. For this fit, we considered the full velocity covariance matrix, that is, the resulting Gaussians’ principal axes are not necessarily aligned with the Cartesian coordinate system. We find a very good fit, with one Gaussian centred at vy ~ 180 km s-1 that would model the disk component(s), while the second Gaussian is centred at vy ~ 20 km s-1 and would represent the likely halo stars. Note that the mean velocity of the disk component is lower than the LSR velocity assumed, and this could partly be because our low metallicity sample has contamination predominantly from the thick, and not the thin, disk. We then flagged stars to be members of the disk or halo components according to whether they are more likely to belong to either of these two Gaussians. Of the 2013 stars in our low metallicity sample which are plotted in Fig. 2, we flagged 1010 to belong to the disk, and 1003 to be likely halo stars (open circles).

This selection scheme however, creates a discontinuity in the distribution of halo stars in velocity space in the form of a hole at the location of the disk. To avoid any unwanted effects or spurious results due to this hole, we additionally flagged a number of stars as halo candidates that were previously marked to belong to the disk. Integrating the areas spanned by the 99% confidence isocontours of the two Gaussian components, we determine that re-labelling 113 stars from the disk to the halo will effectively fill the hole in the halo velocity distribution. We thus randomly draw 113 “disk” stars following the distribution of the halo Gaussian, and re-label them as “halo” stars. Our final halo sample has therefore 1116 stars. We checked that the results of the analysis we are to present are robust by creating 1000 sub-samples of 113 re-labelled disk-to-halo stars.

We also compared the distributions in velocity (and in other projections of phase-space) of the RAVE-metallicity-selected sample to that selected using the WISE colour criteria. We found the RAVE and WISE samples to yield very similar distributions, with the WISE colour-selected sample being largely a subset of the RAVE-metallicity-selected one. However, the metallicity-selected sample has proportionally fewer stars with disk-like kinematics and so we prefer to use it in the rest of the paper.

3. Analysis

Now that we have been able to compile a good sample of halo stars, we can proceed to establish the amount of substructure present in the Milky Way’s stellar halo near the Sun. This will aid in elucidating the importance of past accretion events in the assembly of the Galaxy. On the other hand, the characterisation of the substructures found can tell us about the origin and nature of these potentially fundamental building blocks.

As discussed in the Introduction, we expect merger debris to be apparent in velocity space in the form of tight-moving groups of stars, where a single progenitor galaxy could give rise to several of these depending on its initial size (Helmi & White 1999). Therefore, in Sect. 3.1, we use a velocity correlation function to establish their presence, which should reveal power on small scales above that found in a smooth distribution.

We then turn to the space of integrals of motion, which we define in Sect. 3.2 by the stars’ energy and two components of their angular momenta. Note that energy is conserved if the gravitational potential is time-independent, while if the Milky Way were fully axisymmetric, only the z-component of the angular momentum would be an integral of motion. If these quantities were true integrals of motion, we would expect each accreted object to define a clump whose extent depends on its initial size (Helmi & de Zeeuw 2000). In reality, each of these clumps contains substructure itself (corresponding to each of the groups it produces in velocity space in a localised volume, see Fig. 5 of Gómez & Helmi 2010), but given the current observational uncertainties and the fact that the gravitational potential of the Milky Way is not very well constrained, this (sub)substructure is, in practise, not yet apparent.

Therefore, in Sects. 3.3 and 3.4, we search for the presence of clumps in the space of integrals of motion. We establish the significance of the various over-densities found by comparing them suitably randomised smooth realisations of the data.

3.1. Velocity correlation function

To quantify the presence of moving groups or streams in our dataset, we compute the velocity correlation function defined as (1)where DD is the number of pairs in the data that have a velocity difference | vivj | = Δ in a given range, Δkk + 1, and similarly RR corresponds to the number of random pairs. The velocity correlation function can therefore be used to establish if the data depicts a statistically significant excess of pairs compared to random sets, which would then indicate the presence of velocity clumping or streams.

For this statistical test, we have generated 1000 random samples whose velocity distributions follow the observed 1D distributions, but where the velocity components have been reshuffled. This ensures that we break any correlations or small scale structure present in the data in a model-independent way. In this particular case, we have scrambled the vy and vz velocities.

Figure 3 shows that the velocity correlation function reveals an excess of structure on small scales well above the signal found in the randomised sets. The first few bins indicate a statistically very significant excess of pairs. The error bars in this plot reflect the Poisson statistics uncertainties. For example, the data shows 486 pairs with velocity separations <20 km s-1 while in the random sets, the average is 404.4, implying that in the first bin only there are 82 pairs of stars in excess. The significance level is ~3.7σ assuming the uncertainty is Poissonian. Similarly, there is a very significant excess (8.8σ) of pairs for velocity separations 20 ≤ Δ < 40 km s-1, with 3264 data pairs and 2761.5 random pairs on average.

If we focus on the first velocity bin, we can identify the stars most likely to be related to the excess of pairs. We found that 112 stars appear twice in a close pair, 59 stars 3 times, 24 stars 4 times, 10 stars 5 times and one star appears in 7 tight pairs. These results mean that the signal we detect with the velocity correlation function is not due to a single stream or structure. Furthermore, we have checked that the pairs are not due to binaries by computing the average physical separation between stars in the same tight (Δ < 20 km s-1) kinematic pair. We found this distance to be on average 0.95 kpc, with the closest stars in a pair separated on the sky by ~6 deg at a distance of ~2.75 kpc, implying a physical separation of ~0.27 kpc.

In addition, we note that at large velocity differences, the correlation function seems to indicate a signal well-above that expected from the randomised sets. We shall see that this is plausibly related to an excess of stars on retrograde orbits, since such large velocity differences only can involve objects with extreme kinematics.

In the computation of the correlation function, we have not explicitly “corrected” for the effect of velocity errors. In general, we expect that these will tend to lower the significance obtained, especially in the first velocity bin, that is, the number of real pairs found with small velocity separations is likely a lower limit. In fact, the small change in slope in the correlation function seen in the first bin of Fig. 3 could potentially be related to this effect.

thumbnail Fig. 3

Amplitude of the velocity correlation function as a function of velocity separation defined as in Eq. (1). An excess of kinematic structure in our dataset compared to random (reshuffled) realisations is clearly apparent for small and for very large velocity separations.

Open with DEXTER

3.2. Distribution in integrals of motion space

For each of the stars in our halo sample, we compute their angular momentum and their energy for a potential consisting of a logarithmic halo, a Miyamoto-Nagai disk and a Hernquist bulge: Φ = Φhalo + Φdisk + Φbulge, where (2)with vhalo = 173.2 km s-1, and d = 12 kpc, (3)with Mdisk = 6.3 × 1010M, ad = 6.5 kpc and bd = 0.26 kpc, and (4)with Mbulge = 2.1 × 1010M and cb = 0.7 kpc. The numerical values of the relevant parameters in these models are chosen to provide a reasonable fit to the rotation curve of the Milky Way. In practice, the exact form of the potential is not very relevant provided it represents a fair description in the volume probed by the sample, as it acts mostly as a zero point offset from the kinetic energy.

Figure 4 shows the distribution in energy and z-angular momentum Lz on the top panel, and the versus Lz on the bottom panel for our sample of halo stars (including one random realisation of the “hole” disk stars). The disk stars occupy a very distinct region in these spaces (note the over-density at Lz ~ 1800 km s-1 kpc), with most of the halo having a much more extended distribution both in energy and Lz. The stars with high binding energies (E ≲ −1.6 × 105 km2 s-2) have little net angular momentum, but, in contrast, most of the stars with lower binding energies, E > −1.3 × 105 km2 s-2 appear to have rather large retrograde motions. This striking difference was never seen so prominently in a local sample of stars, although hints of this behaviour can be found in many published works, as we discuss below.

thumbnail Fig. 4

Distribution of energy vs. Lz (top panel), and L vs. Lz (bottom panel) for the halo metallicity selected sample obtained from the cross-match of TGAS and RAVE.

Open with DEXTER

3.3. The less-bound halo: a retrograde component

3.3.1. The reality of the retrograde component

Figure 4 shows evidence that an important fraction of the halo stars with low binding energies that visit the Sun’s vicinity (and are therefore part of our sample) are on retrograde orbits. For example, this percentage is 57.6% for E> −1.6 × 105 km2 s-2, for E> −1.3 × 105 km2 s-2 it is 72.7% and while for E> −1.2 × 105 km2 s-2 the percentage is 84.9%. At these low binding energies (lower than that of the Sun), there appear to be two main features or plumes, namely stars that are only slightly retrograde and that have Lz ~ −500 km s-1 kpc, and stars with very retrograde orbits, with Lz< −1000 km s-1 kpc.

Since there has been an important debate in the literature about the presence of net retrograde rotation in the Galactic outer halo (e.g. Carollo et al. 2007; Schönrich et al. 2011; Beers et al. 2012), it seems relevant to determine whether or not the high proportion of stars in our sample with low binding energies and with retrograde motions could be an artefact of large distance and proper motion errors. It does not appear too unlikely that errors or even a wrong value of the circular velocity of the Local Standard of Rest could slightly shift the plume with Lz ~ −500 km s-1 kpc into the prograde region. For the very retrograde stars, however, this seems to be less plausible. Nonetheless, in Fig. 5, we compare the parallaxes for TGAS and RAVE (scaled as described in Sect. 2.1) for the stars with Lz< −1000 km s-1 kpc and E> −1.6 × 105 km2 s-2. This figure reassuringly shows that the parallaxes derived from both datasets are consistent within the errors for the vast majority of the stars. We focus on the parallaxes because, even though large velocity errors can also be due to large proper motion uncertainties, in our case they are driven mostly by the parallax error, as effectively ϵ(vy) ∝ 4.74ϵ(ϖ) | μ | because of the large magnitude of the proper motion.

To more directly quantify the effect of uncertainties on the reality of the retrograde halo component, we convolved all observables with their errors, and re-computed the distribution of stars in energy and Lz space. We find that in all 1000 realisations made of the data, and for different cuts in energy (with E> −1.6 × 105 km2 s-2), there is an excess of stars with negative Lz of similar amplitude as in the data. For example, for E> −1.6 × 105 km2 s-2, the fraction of stars in this region of integrals of motion space is 57.1% ± 2.4% (compared to 57.6% in the data), while for E> −1.3 × 105 km2 s-2 it is 71.8% ± 4.2% (compared to 72.7% in the data). Therefore, we may conclude the errors alone cannot make the less bound halo become more prograde. In other words, the signal we detect is not an artefact of the errors.

thumbnail Fig. 5

Comparison of the adopted parallaxes (defined as those that have the smallest relative error when comparing the TGAS and RAVE estimates for the high quality dataset defined as in Sect. 2.1), and the TGAS parallaxes for the 33 stars with low binding energies and extremely retrograde motions.

Open with DEXTER

thumbnail Fig. 6

Kernel density estimate of the distribution of our sample of halo stars in ELz space. The stars themselves are shown as black dots. The relative peaks of the density distribution are marked with solid magenta circles.

Open with DEXTER

thumbnail Fig. 7

Left panel: average density of all 1000 randomised realisations of the data in integrals of motion space using the same density estimator as for the real data. Right panel: one of the random realisations, as an example.

Open with DEXTER

3.3.2. Significance

We may use the 1000 randomised (re-shuffled) realisations of the data introduced in Sect. 3.1 to establish the statistical significance of the retrograde component of the less-bound halo. For each realisation, we recompute the energies and angular momenta of the stars using their reshuffled velocities. We then count the fraction of bound stars nR above a given energy Emin and with Lz< 0 km s-1 kpc. We find that no realisation has as many stars as the data in this region of integrals of motion space for values of −1.6 ≤ Emin ≤ −1.2 × 105 km2 s-2. This implies that the probability of finding the observed fraction of retrograde moving stars is smaller than 1 in 1000, or 0.1%.

The average nR and standard deviation σnR of the randomised datasets (which use the same spatial distribution and 1D velocity distributions as the data) allows us to define a significance parameter, s = (nobs− ⟨ nR ⟩ ) /σnR. Depending on Emin, the minimum energy considered, typical values of s range from 4.2 to 7.7, again indicating that the excess of loosely bound stars with retrograde orbits in our sample is statistically very significant.

3.4. The more bound halo: full of structure

We perform a comprehensive analysis of our sample of halo stars in the space of integrals of motion, now looking to identify and characterise over-densities that could correspond to accreted satellites that have contributed to the build-up of the stellar halo.

3.4.1. Statistical analysis

We first focus on ELz space. We constrain our analysis to the most highly bound and populated region in the top panel of Fig. 4, selecting the stars that have −2000 ≤ Lz ≤ 2000 km s-1 kpc, and −2.1 × 105E ≤ −1.0 × 105 km2 s-2. To determine the density field of the stars in ELz space, we apply the sci-kit learn implementation of a non-parametric density estimator that uses an Epanechnikov kernel. For optimal performance of the kernel density estimator, we scaled the data to unit variance. We used the cross-validation method (e.g. Weiss & Kulikowski 1991), also implemented in sci-kit learn, to determine the optimal bandwidth for the kernel density estimator, and found it to be 0.2 in scaled units. The result of this processing is shown in Fig. 6. Since any of the easily visible over-densities in Fig. 6 could be due to stars that were once part of a single accreted object, we are interested in determining their precise location. To this end, we applied a maximum filter in order to identify the relative peaks in the underlying density distribution. We found 17 such local maxima, which are marked in Fig. 6 with solid magenta circles.

thumbnail Fig. 8

To determine the statistical significance of the over-densities we identify in Fig. 6, we bin the data in a 2D grid and count how often we observe as many or more points in the randomised realisations compared to the real data set. For clarity, only the bins with a frequency of such an occurrence of 1% or less are coloured. The left panel shows a 8 × 8 grid, while the right one shows a 16 × 16 grid.

Open with DEXTER

thumbnail Fig. 9

As in Fig. 8, but now for the 3D full integrals of motion space, we determine the statistical significance of over-densities by counting how often we observe as many or more points in the randomised realisations compared to the real data set. Only stars falling in the bins with a frequency of such an occurrence of 1% or less are coloured.

Open with DEXTER

Examining the randomised datasets, an example of which is shown in the right panel of Fig. 7, we see that their density distribution is generally also not smooth, and that several clearly distinct over-densities can often be discerned, especially in the regions of higher binding energy. Therefore, we need to investigate the probability that any of the over-densities we determined in the real data can happen by chance. To do this, we binned the data in ELz space on a series of regular grids with different bin sizes (8 × 8, 16 × 16, 16 × 16, 64 × 64). The different bin sizes are important as we want to explore how the significance of the structures varies with scale. For each bin, we counted how often we observed as many or more stars in the randomised datasets compared to the real data. We then marked those bins for which this frequency was 1% or less. Examples of this procedure are shown in Fig. 8. We find four local maxima identified in Fig. 6 to fall into bins that satisfy this criteria, for all the grids explored. When we perform a similar analysis and compare a random realisation to the remaining 999 random sets, we find that none depict probability levels as low or lower than 1%, indicating that our strict significance cut removes false positives. The fact that the over-densities identified in the data are extremely unlikely to occur by chance and that they appear independently of the grid used, makes us confident that they are indeed due to genuine substructures in the stellar halo of our Galaxy.

On the other hand, comparison of Fig. 6 and Fig. 8 reveals that several of the peaks in ELz space, particularly those located in the denser regions, do not appear to be statistically very significant according to the above analysis. To determine whether or not this could be due to a projection effect, we perform an equivalent statistical analysis but now in 3D, that is, in ELzL space. As before, we divide the space into bins of equal number in all directions. We then count the number of stars in the real data and in the randomised datasets, and identify those bins for which the frequency of finding as many or more stars in the random sets as in the real data is less than 1%. The results of this exercise are shown in Fig. 9 for the two projections of integrals of motion space for the 8 × 8 × 8 grid.

thumbnail Fig. 10

The black contours mark the extent of the substructures we have identified with the highest confidence to be real. These contours have been determined using a watershed algorithm but the member stars (indicated in this figure by the solid dots) are determined by also considering the results of the 3D Poisson analysis. For the structures in the less crowded regions of the ELz space, we set the watershed level to 0.05 of the kernel density estimate, while for the densest parts we set the level to 0.13. We find that with these values, we best trace the significant structures outlined by the density map.

Open with DEXTER

Figure 9 confirms the statistical significance of the bins identified in our 2D analysis. It also helps to better isolate the stars belonging to the structures by using the third dimension, L. On the other hand, several more regions are now identified. In fact, all these regions have a good correspondance with a subset of the structures visible in the 2D density field of the ELz space shown in Fig. 6. Of the 17 maxima we had discerned in this density field, a total of 10 appear to be associated with true statistically significant over-densities of stars in integrals of motion space.

3.4.2. Characterisation of the substructures

Having identified which of the substructures in the ELz space are significant, we use the watershed algorithm (Vincent & Soille 1991) to estimate their extent and to determine their constituent stars, as shown in Fig. 10. This algorithm works by “inverting” the terrain (in this case taking the negative of the density distribution, i.e. ρKDE), and uses the local minima (maxima in the density distribution) as sources of water, “flooding” the basins (structures) until a particular “water level” is reached, effectively determining the extents of those basins (over-densities in our case).

The 3D Poissonian analysis is particularly helpful in the densest regions of the ELz, as those are most affected by contamination, and there L is crucial to disentangle membership. We use this information to supplement the watershed algorithm and remove “interlopers”. We consider as interlopers those stars that do not fall into a significant bin in the 3D analysis, but that, in projection, are located inside a specific contour of the watershed.

thumbnail Fig. 11

Cartesian velocities for the stars comprising the identified structures. The velocities are not corrected for the effects of the Solar motion.

Open with DEXTER

thumbnail Fig. 12

Sky distribution in Galactic coordinates (l,b) of the stars in the structures identified in Sect. 3.4.2. The arrows indicate their velocities in the Galactic latitude and longitude, while the colour corresponds to their line-of-sight velocities. In the background, the stars from the full TGAS release are shown.

Open with DEXTER

In Appendix B, we have tabulated the positions and Tycho IDs of the stars, members of each of the ten substructures we identified with the above analysis. The most prominent and statistically significant of these ten substructures, located at E ~ −1.7 × 105 km2 s-2 and Lz ~ 1800 km s-1 kpc (−10 and 2.5 in scaled units, respectively), is in fact due to the disk (contamination) in our halo sample. Most of the rest are previously unknown structures and we dub them VelHel-1 to -9.

Figure 11 shows the Cartesian velocities, not corrected for the Solar motion, for the stars comprising the nine structures and for the over-density associated with the disk. The disk stars are easily recognisable, having vy ~ 220 km s-1. The rest of the substructures are clustered to different degrees in velocity space, where they sometimes define a single clump, such as, for example, VelHel-4 and VelHel-8 in vxvy space, or distinct separate clumps or streams, such as, for example, VelHel-7 in vyvz space. For such cases, it is likely that we are probing different portions of the orbits of now dispersed accreted satellites. This means that the debris is wrapped multiple times around the Galaxy in order to transverse the same volume with a different velocity.

Figure 12 shows a projection of proper motion vectors onto the sky for the stars in the different groups as well as for the stars in the very retrograde component (defined in this case as those 21 objects that have Lz ≤ −1000 km s-1 kpc and E> −1.4 × 105 km s-1 kpc). In this figure, the arrows indicate the velocity vector corresponding to the Galactic latitude and longitude directions, with components vb and vl, while the colour denotes the amplitude of the stars’ line-of-sight velocities. The background image shows the sky distribution of all TGAS stars. We note that the stars in the various groups are distributed over wide regions on the sky, and that single kinematic clumps are rarely apparent in this projection. However, several flows can often be seen, each possibly indicating a different stream from the various over-densities identified in integrals of motion space.

4. Discussion

4.1. Data caveats and the robustness of the results

We have found evidence of significant over-densities possibly associated with merger debris in a sample of halo stars that was selected on the basis of the RAVE metallicity determination. In the derivation of the phase-space coordinates of this sample, we used the parallaxes from TGAS or RAVE, depending on which had the smallest relative error. We have explored the impact of using the absolute error, instead of the relative error, as a discriminator, or of not scaling the RAVE parallaxes by 11% for the giant stars, and found our results to be robust. Our relative error tolerance of 30% may be considered relatively large, but this is necessary to have a sufficiently large sample of stars in which to identify the subtle substructures the models predict. Nonetheless, with stricter error cuts, the global kinematic properties remain similar, and the retrograde halo component, although populated by fewer stars, is still clearly visible.

To explore the effect of the uncertainties in energy and Lz on the substructures reported in Sect. 3.4, we have used the 1000 samples introduced in Sect. 3.3.1 that resulted from (re)convolving the observables with their errors. We have ran our kernel density estimator on the resulting distributions of energy and Lz with identical settings as for the real data. As one could expect, we find that errors tend to slightly blur the structures, but generally not enough to make significant changes. Exceptions are those substructures containing few stars, where the effect of Poisson statistics is significant.

Since TGAS does not contain all Tycho-2 stars, we also investigated the phase-space distribution of all the Tycho-2 stars included in the RAVE sample, and found no important differences. Again, the retrograde component is clearly apparent, and some of the other over-densities are visible as well in this more complete but less accurate dataset.

In a sample of 1116 halo stars we found ~240 stars to be part of over-densities in integrals of motion space (219 stars in the 10 structures identified in Sect. 3.4, and 21 stars that are part of the very retrograde less-bound component, defined here as Lz< −1000 km s-1 kpc and Emin> −1.4 × 105 km2 s-2). This is close to the total number of stars found in tight velocity pairs (with velocity separations smaller than 20 km s-1) according to the velocity correlation function analysis. In fact, there is a relatively good correspondance between the stars located in the over-densities in the densest part of the ELz space and those in the tight kinematic pairs.

It could be tempting to conclude, based on these numbers, that at least ~21% of the stars in the halo near the Sun have therefore been accreted. However, this conclusion would have to rely on a good understanding of the completeness of our sample. This is in fact difficult to determine at this point, both because of the selection function of TGAS, especially at the bright end, and that of RAVE. However, Wojno et al. (2016) and Kunder et al. (2017) show that there are no indications of any biases in the velocity distributions of RAVE stars. Nonetheless, the determination of the true (relative) contributions of the various over-densities to the stellar halo near the Sun remains difficult to estimate reliably, even if only because the RAVE survey has obtained data for stars only visible from the southern hemisphere. We will have to defer such analyses to future work; in particular the second Gaia data release will make this possible.

4.2. The retrograde halo: ω Cen and more

A literature search quickly demonstrates that there are independent reports from many different surveys of predominantly retrograde motions in (some portions of) the Galactic halo, indicating that this is an important (sub)component (Carney et al. 1996; Carollo et al. 2007; Nissen & Schuster 2010; Majewski et al. 2012; An et al. 2015). In our sample, we have defined as less bound stars those with energies Emin> −1.6 × 105 km2 s-2, for which we find 95 candidates, while if Emin> −1.3 × 105 km2 s-2, there are 32 objects. What is remarkable, is that nearly all nearby halo stars somewhat less bound than the Sun are on retrograde orbits.

As Fig. 4 shows, this portion of the halo seems to contain two features, stars that are only slightly retrograde, and some that have very negative angular momenta. Those that are only slightly retrograde, including the structure we identified as VelHel-1 (and possibly also VelHel-3 and VelHel-5, see also Fig. 9) could perhaps be all associated with the progenitor galaxy of ωCen (see also Fernández-Trincado et al. 2015). Dinescu (2002) built a good case for a scenario in which ωCen represents the core of an accreted nucleated dwarf, although she focused on debris on slightly tighter orbits, closer to those defined by ωCen itself (see also Brook et al. 2003). Bekki & Freeman (2003) and Tsuchiya et al. (2003) used numerical simulations to convincingly argue that the progenitor galaxy must have been a lot more massive if it had to sink in via dynamical friction on a retrograde orbit. Such models predict a trail of debris with a range of binding energies possibly resembling the elongated feature we identified in our analyses.

Although the prominence of ωCen has been put forward before, we have found that it is not the only important contributor to the retrograde portion of the halo. In this paper we have shown the striking dominance of halo stars on less-bound retrograde orbits crossing the Sun’s vicinity. Depending on the minimum energy considered, between ~58% and 73% of these stars have motions that are significantly retrograde. Such a high fraction appears to be intuitively somewhat surprising, and we turn to cosmological simulations to establish how likely this is.

We use the Illustris cosmological hydrodynamical simulation (Vogelsberger et al. 2014a,b; Nelson et al. 2015) to quantify the likelihood of finding galaxies with a given fraction of their halo stars on counter-rotating orbits. To this end, we select all central galaxies within the virial mass range M200 = 0.8−2.0 × 1012M, which encompasses current estimates for mass of the Milky Way. We rotate each system such that the total angular momentum of the galaxy (defined by all the stars within rgal = 0.15r200) points along the z-axis. In this rotated system, we compute the fraction of stars outside a given radial distance (r> 10,15,20 kpc) that have negative angular momentum in the z-direction, that is, that have retrograde orbits. The results are shown in Fig. 13, which plots the cumulative fraction of galaxies in our sample where the counter rotating stellar halo accounts for more than a given fraction of the total stellar mass outside the given radius. We find that ~1% of the galaxies with Milky Way-like mass have a halo with more than 60% of the stars being retrograde outside r = 15 kpc, and that each half the sample has a counter-rotating fraction equal or larger than ~30%. For comparison, the values inferred from our halo sample are shown with the grey shaded region. Such a large fraction of retrograde halo stars thus seems rather unusual according to these cosmological simulations, but is certainly not impossible. In addition, it should be borne in mind that our sample is not complete, and that the fractions we have found could have been overestimated because of selection effects.

thumbnail Fig. 13

Cumulative fraction of Milky Way-mass galaxies in the Illustris simulation, containing a given fraction of retrograde halo stars beyond a radial distance of 10, 15 and 20 kpc as indicated in the inset. The grey region denotes our estimated fraction of less-bound halo stars with retrograde motions in the real data.

Open with DEXTER

Furthermore, because of numerical resolution limitations in the simulations, we based our comparison on stellar particles located beyond a given radius, while our observational selection is based on an energy threshold for stars crossing the Solar neighbourhood now. Therefore these conclusions should be taken as intriguing and suggestive, and will need to be confirmed once the selection function of our sample is better known, and should be contrasted to higher resolution cosmological simulations.

4.3. Comparison to cosmological simulations: granularity

Another interesting point is to establish whether the level of velocity granularity that we find in our dataset is consistent with cosmological simulations of the formation of stellar halos. Because we are not concerned with a specific sense of rotation with respect to a major Galactic component, we may use the stellar halos that result from coupling a semi-analytic galaxy formation model to the Aquarius dark-matter-only simulations (Cooper et al. 2010), after applying a suitable tagging and resampling scheme to the dark-matter particles (Lowing et al. 2015). This suite of simulations therefore only models the accreted component of a halo but has much higher resolution than the Illustris cosmological hydrodynamical simulations, for example. The resulting Aquarius stellar halos have a range of stellar masses from 3.8 to 18.5 × 108M. Therefore, to make a more direct comparison to our own Galactic halo, we have scaled the simulated stellar halos to have a stellar mass of 2.64 × 108M, that is, that estimated for the Milky Way by Robin et al. (2003). We do this by down-sampling the number of stars per population (e.g. main sequence, red giant branch, etc) by the corresponding factors.

For each of the halos, we placed a small sphere of 2.8 kpc radius at 8 kpc from the centre and randomly selected 1116 objects amongst the simulated stars with magnitudes in the range 8 ≤ G ≤ 12.5. We applied the appropriate photometric band transformations (Jordi et al. 2010), as the Lowing et al. (2015) dataset provides the stellar magnitudes in the SDSS filters. We then computed the velocity correlation function as we have done for our observed dataset, and compared them to random (reshuffled) distributions, again as for the real data.

In Fig. 14, we show the resulting velocity correlation functions. Interestingly, the amplitude of the signal in the simulated halos is very close to the signal we find in our sample of halo stars, although there is some scatter from simulation to simulation. We also note that the range of velocities over which we make the comparison is different to that used for the data and this is because the Aquarius halos have a smaller velocity dispersion as they do not have a disk component. This means that the maximum velocity separation is bound to be smaller. Nonetheless, the excess of pairs with large velocity separation is similar in amplitude and shape to what we found in the data.

Taken at face value this comparison implies that the amount of substructure present in our sample of Milky Way halo stars is consistent to that expected for halos built solely via accretion. In order to see significant differences, and to robustly determine the true relative fractions of accreted versus in-situ halo stars would require a much larger sample. Fortunately, such a sample will become available with the second and subsequent data releases from the Gaia mission.

thumbnail Fig. 14

Velocity correlation function for a sample of “stars” extracted from the stellar halos in the Aquarius simulations (Lowing et al. 2015). The stars are located in a sphere of 2.8 kpc and have magnitudes in the range 8 ≤ G ≤ 12.5. The sample size is the same as our observed halo dataset. A comparison to Fig. 3 shows a similar excess of pairs in the simulations as in the data, although there is some variance in the simulations.

Open with DEXTER

4.4. New and old substructures

We now review our knowledge of substructures in the space of integrals of motion. Firstly, it is important to note that the fact that the region occupied by disk(s) appears as such a prominent over-density in our analyses would suggest that the disks have a rather extended metal-poor tail (see e.g. Kordopatis et al. 2013b). We have already discussed the possible relation of structures VelHel-3, VelHel-1 and VelHel-5 to ωCen debris. Perhaps VelHel-2, VelHel-8 and VelHel-9 are also associated to this, although they could also be independent structures themselves. On the other hand, structures VelHel-6 and VelHel-7 were previously unknown, as well as VelHel-4. This particular clump has very interesting kinematics as it is on a low inclination orbit and rotates almost as fast as the disk, but in the opposite direction.

All these substructures are new and have not been reported in the literature to the best of our knowledge. We have quickly inspected their distribution in [α/Fe] versus [Fe/H] as determined by the RAVE survey pipeline and find them fairly undistinguishable from canonical halo stars, and without any particular degree of clustering in this chemical abundance space.

In comparison to other reports of substructure, such as those based on the SDSS survey by Smith et al. (2009), Re Fiorentin et al. (2015), for example, we do not find overlap. Also noticeable is the absence of an over-density of stars associated with the Helmi et al. (1999) streams, although there is a small kinematic group in our sample at the expected location, but which is not picked up as statistically significant by our analysis. We inspected the distribution of these stars and found them to be preferentially located at high Galactic latitude. The footprint of the RAVE survey thus seems to hamper the discovery of more members, whose presence has been confirmed for example in the SDSS survey, which does cover the north Galactic pole region well. Similarly, the over-densities reported by Smith et al. (2009), Re Fiorentin et al. (2015) do not appear in our sample, and these correspond to structures with rather high L. Stars with large L for a given Lz have high inclination orbits, and therefore move fast in the vertical direction when crossing the Solar neighbourhood. They would be more easily detected when observing towards the Galactic poles, but the RAVE survey penalises these regions, although in the case of the south Galactic pole, it is an issue of completeness with magnitude (Wojno et al. 2016). Selection effects such as these are clearly important when estimating the fraction of the halo that is in substructures, and require careful attention when making quantitative assessments.

5. Conclusions

We have constructed a sample of stars based on the cross match of the recent first Gaia data release known as TGAS to the RAVE spectroscopic survey, and identified a subset as halo stars based on their RAVE metallicity. We analysed their kinematics and demonstrated, with a velocity correlation function, the existence of a significant excess of pairs of stars with small velocity differences compared to the number expected for random distributions obtained by re-shuffling the various velocity components of the stars in our sample. The statistical significance of the signal is 3.7σ for separations smaller than 20 km s-1, and 8.8σ for 40 km s-1 separations.

We then determined the distribution of our sample of halo stars in integrals of motion space, defined by two components of the angular momentum, L and Lz and by the energy of the stars, which was computed for a reasonable estimate of the Galactic potential. The distribution of stars in this space is complex and shows a high degree of structure. Firstly, we established the presence of a dominant retrograde component for stars somewhat less bound than the Sun. The probability that ~60% or more of the stars with such orbital characteristics occurs by chance as estimated from the randomised smooth sets is smaller than 1/1000. For the more bound halo stars, we identify at least ten statistically significant over-densities whose probability, derived using the randomised sets, is smaller than 1%. Of these over-densities, the most prominent appears to be associated with the metal-poor tail of the disk(s).

We have also performed comparisons to cosmological simulations. The level of substructure revealed by the velocity correlation function for our sample is comparable to that found in Solar neighbourhood-like volumes with similar numbers of stars in the stellar halos of the Aquarius simulations by Lowing et al. (2015). This indicates that it is plausible that the whole Galactic stellar halo was built via accretion, as this is the only channel considered in these simulations. We have also established the frequency of occurrence of retrograde outer halos with similar predominance as estimated from our halo sample. In the Illustris simulations (Vogelsberger et al. 2014a), less than 1% of the Milky Way mass galaxies have outer halos where more than ~60% of the stars have retrograde motions. At face value, this is very intriguing. However, this comparison suffers from several issues, such as incompleteness in the observational sample and numerical resolution limitations of the simulations. Nonetheless, the predominance of the retrograde halo is striking in itself, and has been found perhaps less dramatically by several studies independently, indicating that it is likely a major component of the Galactic halo.

This first analysis of data obtained by the Gaia satellite mission has thus revealed many exciting results. We look forward to better understanding what our findings imply from the dynamical and chemical perspectives for the history of the Milky Way. There is plenty of work to do before the second Gaia data release becomes available.

Acknowledgments

It is a pleasure to thank Lorenzo Posti and Anthony Brown for the numerous enlightening discussions. We are grateful to the referee for a prompt and constructive report. A.H. acknowledges financial support from a VICI grant from the Netherlands Organisation for Scientific Research, N.W.O. M.B., J.V. and A.H. are grateful to NOVA for financial support. This work has made use of data from the European Space Agency (ESA) mission Gaia (http://www.cosmos.esa.int/gaia), processed by the Gaia Data Processing and Analysis Consortium (DPAC, http://www.cosmos.esa.int/web/gaia/dpac/consortium). Funding for the DPAC has been provided by national institutions, in particular the institutions participating in the Gaia Multilateral Agreement.

References

Appendix A: Estimating the bias on the distance obtained by inverting the trigonometric parallax

In the ideal case of error-free measurements, the distance to a star is simply the inverse of its measured trigonometric parallax. In reality, however, measurements do have errors, and this makes the calculation of the distance significantly more involved. There has been some debate in the literature over what is the best way to obtain reliable distances to stars using their parallaxes and what biases one can expect when inverting the parallax to obtain a distance (e.g. Arenou & Luri 1999; Smith 2006; Astraatmadja & Bailer-Jones 2016).

For the purpose of the analysis presented in this paper, we use distances obtained by inverting the parallaxes both from RAVE and from the TGAS datasets. For the RAVE stars, as discussed earlier, Binney et al. (2014) has shown that the best distance estimate is obtained by inverting the parallax estimated by the RAVE pipeline. Our focus in this Appendix is therefore on those halo stars in our sample for which the TGAS parallax is more precise than the RAVE parallax. These stars are all located within 1.2 kpc from the Sun, as shown in Fig. A.1.

In order to quantify the bias we may introduce by inverting the TGAS parallaxes, we have performed the following test. Using the Gaia Universe Model Snapshot (GUMS, Robin et al. 2012) we select all halo stars in the model within 3 kpc from the Sun that are within the TGAS magnitude limits. We consider a larger distance range to explore the issue of inverting trigonometric parallaxes more broadly. GUMS here represents a perfect model, and all stellar quantities available are error-free, meaning one can convert from distance to parallax by taking its inverse. We then convolve these true parallaxes with the errors of the halo stars in the sample we defined in Sect. 2.2, and assuming the errors are Gaussian. The convolution is repeated 1000 times so that we have 1000 different realisations of the same GUMS stars. Finally we calculated the “measured” distances by inverting the “individual measurements” of the parallaxes.

thumbnail Fig. A.1

Distribution of distances obtained by inverting the trigonometric parallaxes for our sample of halo stars and for which the TGAS parallax is more precise than the RAVE parallax estimate.

Open with DEXTER

Figure A.2 shows that there is virtually no difference between the true distance for the stars in the GUMS dataset, and the mean distance obtained by inverting the “mean” parallax of all 1000 error convolved sets. In fact, the average absolute difference between these two quantities is smaller than 10 pc. The left panels of Fig. A.2 show the true distances of GUMS stars versus one set of the error-convolved distances as black dots. The top panel corresponds to the distance range probed by the TGAS stars in our sample, while the bottom panel is for distances up to 3 kpc. In both panels, the blue points represent the mean values obtained by inverting the mean parallax of all 1000 convolved sets, and these lie perfectly on the 1:1 relation. The green points correspond to the mean obtained when applying a 30% cut to the relative parallax error for each realisation.

The bottom left panel in Fig. A.2, especially, shows that the black dots (which represent distances obtained from a single realisation of the error convolved parallaxes) are not evenly scattered around the 1:1 line. This tells us that, when convolving true parallaxes with errors and then inverting them to get the distance, the most likely value obtained for the distance often underestimates the true distance. On the other hand, the distance can be scattered more towards larger values than towards smaller ones.

This effect is clearly shown in the right-hand side panels of Fig. A.2. The distance distribution of a single star for all 1000 error convolved sets is shown in the top for a TGAS star at a true distance of 0.75 kpc, and for a star farther away at 1.9 kpc in the bottom panel. In both cases, the solid and dashed vertical lines mark the true distance and distance obtained by inverting the mean of the parallax distribution, respectively, of the considered star, without (blue) and with (yellow) a relative parallax error cut. Since the TGAS halo stars in our sample are located closer than ~1.2 kpc, we can conclude that the effect is negligible.

More generally, if we can assume that the errors on the parallax are Gaussian, then by inverting the mean parallax to obtain a distance, it is possible to recover the true distance well. In the case of Gaussian errors, the mean and the maximum likelihood estimator of the parallax coincide. We note that it would not seem to be wise under these circumstances to attempt to derive a distance estimate from the inversion of each of the individual trigonometric parallaxes, since the maximum likelihood estimator of the distances obtained is different from the mean of their distribution. Since the parallax error distribution has not been characterised yet for the TGAS sample, we are forced to make the simple assumption of Gaussian errors. Future data releases will allow us to establish more robustly if there are biases that need to be considered.

thumbnail Fig. A.2

Left panels: true distances vs. a (random) set of error-convolved distances for a sample of halo stars from the Gaia Universe Model Snapshot. The top panels cover the distance range for the stars in our halo sample with parallaxes from TGAS, while the bottom panels do so for the full halo sample. The blue points in these panels represent the inverted average value of the parallaxes obtained from all 1000 convolved sets for each star, and they lie almost perfectly on the 1:1 line, while the green points are the averages obtained using only those realisations for which the relative parallax error is smaller than 30%. Right panels: distribution of distances from the 1000 error convolved parallaxes for a single star, at a distance of 0.75 kpc representative of a TGAS star (top), and at a distance of 1.9 kpc for a more distant object (bottom). One can see that the true distance to the star (solid vertical line) matches the distance obtained by inverting the parallax (dashed vertical line) extremely well, and this, in fact, coincides well with the mean of the distance distribution.

Open with DEXTER

Appendix B: Stars members of the newly identified substructures

Table B.1

Positions and Tycho ids for the stars belonging to a substructure identified as the disk in Fig. 10.

Table B.2

Positions and Tycho ids for the stars belonging to VelHel-1.

Table B.3

Positions and Tycho ids for the stars belonging to VelHel-2.

Table B.4

Positions and Tycho ids for the stars belonging to the substructures VelHel-3.

Table B.5

Positions and Tycho ids for the stars belonging to VelHel-4.

Table B.6

Positions and Tycho ids for the stars belonging to VelHel-5.

Table B.7

Positions and Tycho ids for the stars belonging to VelHel-6.

Table B.8

Positions and Tycho ids for the stars belonging to VelHel-7.

Table B.9

Positions and Tycho ids for the stars belonging to VelHel-8.

Table B.10

Positions and Tycho ids for the stars belonging to VelHel-9.

All Tables

Table B.1

Positions and Tycho ids for the stars belonging to a substructure identified as the disk in Fig. 10.

Table B.2

Positions and Tycho ids for the stars belonging to VelHel-1.

Table B.3

Positions and Tycho ids for the stars belonging to VelHel-2.

Table B.4

Positions and Tycho ids for the stars belonging to the substructures VelHel-3.

Table B.5

Positions and Tycho ids for the stars belonging to VelHel-4.

Table B.6

Positions and Tycho ids for the stars belonging to VelHel-5.

Table B.7

Positions and Tycho ids for the stars belonging to VelHel-6.

Table B.8

Positions and Tycho ids for the stars belonging to VelHel-7.

Table B.9

Positions and Tycho ids for the stars belonging to VelHel-8.

Table B.10

Positions and Tycho ids for the stars belonging to VelHel-9.

All Figures

thumbnail Fig. 1

Distribution of the ratio between the parallax in TGAS to that in RAVE. The vertical lines indicate the mean ratio for dwarfs (dotted) and giants (solid) and show that dwarfs’ parallaxes are consistent with one another in the datasets, but that the giants are systematically underestimated by 11% in RAVE.

Open with DEXTER
In the text
thumbnail Fig. 2

Velocities of stars in the TGAS cross-matched to the RAVE sample described in Sects. 2.1 and 2.2, selected according to the RAVE metallicity [M/H] ≤ −1.5 dex. The subset of stars classified as belonging to the halo according to a two-component Gaussian kinematic model, are plotted with open circles.

Open with DEXTER
In the text
thumbnail Fig. 3

Amplitude of the velocity correlation function as a function of velocity separation defined as in Eq. (1). An excess of kinematic structure in our dataset compared to random (reshuffled) realisations is clearly apparent for small and for very large velocity separations.

Open with DEXTER
In the text
thumbnail Fig. 4

Distribution of energy vs. Lz (top panel), and L vs. Lz (bottom panel) for the halo metallicity selected sample obtained from the cross-match of TGAS and RAVE.

Open with DEXTER
In the text
thumbnail Fig. 5

Comparison of the adopted parallaxes (defined as those that have the smallest relative error when comparing the TGAS and RAVE estimates for the high quality dataset defined as in Sect. 2.1), and the TGAS parallaxes for the 33 stars with low binding energies and extremely retrograde motions.

Open with DEXTER
In the text
thumbnail Fig. 6

Kernel density estimate of the distribution of our sample of halo stars in ELz space. The stars themselves are shown as black dots. The relative peaks of the density distribution are marked with solid magenta circles.

Open with DEXTER
In the text
thumbnail Fig. 7

Left panel: average density of all 1000 randomised realisations of the data in integrals of motion space using the same density estimator as for the real data. Right panel: one of the random realisations, as an example.

Open with DEXTER
In the text
thumbnail Fig. 8

To determine the statistical significance of the over-densities we identify in Fig. 6, we bin the data in a 2D grid and count how often we observe as many or more points in the randomised realisations compared to the real data set. For clarity, only the bins with a frequency of such an occurrence of 1% or less are coloured. The left panel shows a 8 × 8 grid, while the right one shows a 16 × 16 grid.

Open with DEXTER
In the text
thumbnail Fig. 9

As in Fig. 8, but now for the 3D full integrals of motion space, we determine the statistical significance of over-densities by counting how often we observe as many or more points in the randomised realisations compared to the real data set. Only stars falling in the bins with a frequency of such an occurrence of 1% or less are coloured.

Open with DEXTER
In the text
thumbnail Fig. 10

The black contours mark the extent of the substructures we have identified with the highest confidence to be real. These contours have been determined using a watershed algorithm but the member stars (indicated in this figure by the solid dots) are determined by also considering the results of the 3D Poisson analysis. For the structures in the less crowded regions of the ELz space, we set the watershed level to 0.05 of the kernel density estimate, while for the densest parts we set the level to 0.13. We find that with these values, we best trace the significant structures outlined by the density map.

Open with DEXTER
In the text
thumbnail Fig. 11

Cartesian velocities for the stars comprising the identified structures. The velocities are not corrected for the effects of the Solar motion.

Open with DEXTER
In the text
thumbnail Fig. 12

Sky distribution in Galactic coordinates (l,b) of the stars in the structures identified in Sect. 3.4.2. The arrows indicate their velocities in the Galactic latitude and longitude, while the colour corresponds to their line-of-sight velocities. In the background, the stars from the full TGAS release are shown.

Open with DEXTER
In the text
thumbnail Fig. 13

Cumulative fraction of Milky Way-mass galaxies in the Illustris simulation, containing a given fraction of retrograde halo stars beyond a radial distance of 10, 15 and 20 kpc as indicated in the inset. The grey region denotes our estimated fraction of less-bound halo stars with retrograde motions in the real data.

Open with DEXTER
In the text
thumbnail Fig. 14

Velocity correlation function for a sample of “stars” extracted from the stellar halos in the Aquarius simulations (Lowing et al. 2015). The stars are located in a sphere of 2.8 kpc and have magnitudes in the range 8 ≤ G ≤ 12.5. The sample size is the same as our observed halo dataset. A comparison to Fig. 3 shows a similar excess of pairs in the simulations as in the data, although there is some variance in the simulations.

Open with DEXTER
In the text
thumbnail Fig. A.1

Distribution of distances obtained by inverting the trigonometric parallaxes for our sample of halo stars and for which the TGAS parallax is more precise than the RAVE parallax estimate.

Open with DEXTER
In the text
thumbnail Fig. A.2

Left panels: true distances vs. a (random) set of error-convolved distances for a sample of halo stars from the Gaia Universe Model Snapshot. The top panels cover the distance range for the stars in our halo sample with parallaxes from TGAS, while the bottom panels do so for the full halo sample. The blue points in these panels represent the inverted average value of the parallaxes obtained from all 1000 convolved sets for each star, and they lie almost perfectly on the 1:1 line, while the green points are the averages obtained using only those realisations for which the relative parallax error is smaller than 30%. Right panels: distribution of distances from the 1000 error convolved parallaxes for a single star, at a distance of 0.75 kpc representative of a TGAS star (top), and at a distance of 1.9 kpc for a more distant object (bottom). One can see that the true distance to the star (solid vertical line) matches the distance obtained by inverting the parallax (dashed vertical line) extremely well, and this, in fact, coincides well with the mean of the distance distribution.

Open with DEXTER
In the text

Current usage metrics show cumulative count of Article Views (full-text article views including HTML views, PDF and ePub downloads, according to the available data) and Abstracts Views on Vision4Press platform.

Data correspond to usage on the plateform after 2015. The current usage metrics is available 48-96 hours after online publication and is updated daily on week days.

Initial download of the metrics may take a while.