Friday, June 18, 2010

Likelihood fitting of distance modulus




I have begun writing a Matlab code to compute the distance modulus, reddening, and age of a cluster through Bayesian statistics. As a first step, I am keeping the proper age and fitting only for distance modulus with cluster NGC 752. For a range of distances around the proper value, I computed the tau squared of the fit, the -2 sum of the logs of the probabilities. Right now I am doing this with two for loops, one for each distance and one for each star, but when I get the code better I wish to use only matrix manipulation, since Matlab is optimized for working with matrices.

After correcting a few errors, I did manage to get a plot which looks right and gives back an answer somewhere in the ball park. I get a distance modulus of 7.36. The actual distance modulus is 8.18 and I can see just looking at the graph that 7.36 is too low. This may be occuring from a line of stars to the right of the main sequence and below the red giant area. Moving downward gets them, out in the middle of no where, closer to the sequence.

My next steps involve getting rid of the for loops, figuring out exactly why the fit is bad, and then working in fitting for age as well.

Thursday, June 17, 2010

PDFs, CDFs, and two new clusters





I have added two new clusters to my list, NGC 1664 and NGC 6940, with an assortment of stars in each. I found a cluster that is potentially quite usable, but no one else had an age estimate, so I will work with that after constructing my CMD fitting procedure.

Even though I will fit the masses properly after I have derived my own properties for the clusters, I wrote the program to compute the probability distribution function (PDF) and cumulative distribution function (CDF) for a star given an isochrone. It involves calculating the probability of a given location in the star's Gaussian distribution on the isochrone. For each part of the isochrone, I break it into a certain distance away in the sigma v direction (vertical) and sigma b direction (slope 1) from the star. Each distance is then divided by the respective standard deviations and I take the square root of the sum of the squares. This is the (data - fit)/sigma that goes into the Gaussian equation.
The CDF is merely the integral of the PDF. The PDF is normalized.

The hardest part of this process is actually tracking down the original source of the UBV values to arrive at the error of U and V. Some of the sources are old or have moved around, making it either difficult or impossible to find, so I only managed to compute this for two stars.

A Maximum Likelihood Method for Fitting Color Magnitude Diagrams

In their paper, Naylor and Jeffries presented a method of fitting clusters to an isochrone in a color- magnitude diagram using Bayesian statistics in such a way as to have numerical backing for a fit and to get useful error bounds. For each star, they multiply the star’s Gaussian distribution together with the density of the isochrone, then integrating over all space to obtain the probability that the star came from that isochrone. Tau squared, a very similar quantity to chi squared, is negative two times the sum of the logarithms of the probabilities. A good isochrone fit minimizes tau squared. Minimizing tau squared is the same as maximizing the likelihood that the data came from the given isochrone, since tau squared is a modification of the log of the likelihood. The likelihood of a fit given the data is proportional to the likelihood of the data given the fit by Bayes’s Theorem, so maximizing one is maximizing the other.

Past this theory, several additional factors enter into actual computation. For example, one standardizes the number of stars per magnitude range so that the initial mass function is not a parameter in the fit. Error bounds were found by a bootstrap method when the cluster was simulated (a set of different inputs according to a Gaussian were inputted and the result examined), and integrating over tau squared space gave error bounds in a real cluster. To account for the interaction between B error and B-V error, the color magnitude diagram must be translated into a magnitude-magnitude diagram to compute the probabilities, yet this only happens for the one step.

They worked this method of fitting both on simulated data sets and an actual cluster, and both agreed with the back of the book answer.

There are still multiple things I do not understand in the paper, but they are more details of implementation and terminology. How is the normalization over magnitude ranges accomplished, and why does that not give unfair wait to higher magnitudes? What is the theta and gradient discussed on the seventh page? Though the paper says the method accounts for binary systems, how does it do that?

Tuesday, June 15, 2010

4 Clusters Choosen

So I have now chosen four clusters and approximately three stars in each cluster. These clusters are M37, NGC 752, Collinder 350, and Melotte 111 (Coma Berenices). I choose them based on age (old enough to have red giants), quality of UBV data, location in the sky, and not already being in a planet survey.

Starting out, I searched through WEBDA's list to find open clusters with positive declination visible during the summer from Hawaii and having log age > 8.5 . I found that clusters with log age less than 8.5 were usually too young to have red giants.

I plotted the B-V vrs. V for each cluster and used published parameters (either on WEBDA or in papers) to get an approximate fit for each cluster. If things did not work at all or the UBV data was scarce and not clear, I would pass the cluster over.


Once I thought a cluster passed those tests, I choose individual stars to get a rough mass estimate for. Since this is a first rough test, I merely computed which point on the theoretical isochrone was closest to the star distance wise. When I go through after getting my own cluster parameters, I will compute the PDF, by analyzing B-V and V separately with their own standard deviations (not just as distance) and assuming Gaussian distributions. That way I will actually be able to get error bounds as well.

Even though I wanted to get an even mass range, I ended up getting a lot of 2.7 mass stars in M37, Collinder 350, and Melotte 111 with 1.7 mass stars in NGC 752. I need older clusters to get lower mass stars, but most clusters, other than NGC 752 and a few others, have dissipated by 1 Gyr of age.

As a continuation of yesterdays post, it was mostly the one star I was focused on that wasn't working. That particular star still doesn't work, but other stars have been looked up in both SINBAD and Hipparcos. Some stars still do not show in the catalogs, but for several I was able to get a more accurate location, official names, and variability information. Some of the 12 stars I had initially chosen now have to get thrown out for either being spectroscopic binaries or microscopically variable.

Monday, June 14, 2010

Accessing data in different catalogs

I have been studying M37 with the data given by the WEBDA database. Now I am trying to access information on the stars from alternate locations but am having a very difficult time trying to locate the same stars. In a paper I read, they accessed the star's data in the Hipparcos database to determine whether or not the star was variable in the visual range. I wish to make sure the stars I am singling out are not variable, so I am trying to look them up in Hipparcos.

I am working with NGC 2099 star no 195 with RA(2000)=05 52 16.25 and Dec(2000)=+32 36 20.7
When entering the RA and Dec coordinates of the star I get "No data returned!" from Hipparcos.
When searching for the star with SINBAD, I cannot get the same star. I get a similar star given by Cl* NGC2099 J136 which also has the designation NGC2099 195. The 195 similarity suggests they are the same star, but WEBDA gives it a V magnitude of 11.009 and SINDAB gives it 11.27. Also SINBAD has a B-V of .5 where WEBDA has a B-V of .76.
So I am stuck.

Saturday, June 12, 2010

A Planet Around the Evolved Intermediate Star HD110014


by Medeiros, Setiawan, Hatzes, Pasquini, Girardi, Udry, Dollinger, and da Silva
Astronomy and Astrophysics 2009

Medieros et. all discovered a 835.477±6.04 day period, Msin i = 11.1(9.5) MJup planet around HD110014. Relevant to my research, HD110014 is an evolved higher mass star, but is not in a cluster. The bracketed value after the quoted mass is a result from not knowing the mass of the sun. After computing the PDF for the mass of the main star by summing over all possible parts of all possible isochrones it could belong to, there is roughly equal probabilities of it being a 1.55 Gyr 1.9Msun star, or a 0.68Gyr 2.5 Msun star. A slightly higher C12/C13 ratio gives weight to the 2.5 Msun solution, but both still exist. The 9.5Mjup solution for the mass arises under treating the star as a 1.9Msun star instead of a 2.5Msun star.

After removing the planet, a 130 day periodicity remains. This may be a result of a planet, but the evidence is not strong enough to convincingly say so. Treating and it as a planet and removing from the residuals eliminates all remaining periodicity in the signal.
In order to determine that the 835 period is not the result of stellar phenomena, they ran several tests. I need to read other papers specifically on the bisector velocity span and CaII H&K emission lines in order to understand them better. They also studied the visual magnitude that came from the Hipparcos data. It was classified as a non-periodic star, and no significant peak appeared in the periodogram at 835 days.

Plotting Theoretical Isochrones and Cluster Data


In order figure out the mass of the stars, I have to fit it to theoretical curves for a range of stellar masses at the same age and metallicity as the cluster.

The website CMD computes all the relevant data for a given age and metallicity, and can also factor in extinction, circumstellar dusk, and returns a file filled with data points over a range of masses. In this file are the Absolute Blue B and Visual V Magnitudes.

The WEBDA database has data on each star for a given open cluster. In this I can get the relative B and V magnitudes.

So as just a rough start, I tried plotting (B-V versus V)several clusters (translated to absolute magnitudes by compensating for distance) and roughly same age isochrones in Matlab. I plotted -V actually instead of V so that the graph would look right side up. It didn't work.

I fiddled around with it for a while to try and figure out what was wrong. The sun's data point fell into the cluster data, so I knew the isochrones were wrong. Finally I discovered that when importing the data files, Matlab stuck a column on Nan in front of the rest of the data. I was thinking B and V were in 9 and 10 respectively, but they were actually shifted and in 10 and 11th columns.
NOW IT WORKS YAY!!!!!!