Innovative Algorithms for Astrophysics

Designer Algorithms for Astronomy

Large archives of astronomical data (images, spectra and catalogues) are being assembled into a database that will soon be accessible worldwide as part of the Virtual Observatory. This calls for techniques that allow fast, automated classification and extraction of key physical properties from very large datasets, together with the ability to visualise the structure of highly multi-dimensional data and to extract and study substructures in a flexible way. This collaboration between the School of Physics and Astronomy and the School of Computer Science at the University of Birmingham aims to adapt a number of innovative data-analysis algorithms for astronomical data mining, among them genetic programming and evolutionary computation, latent variable analysis, computer vision and machine learning networks.

The effort is coordinated by Somak Raychaudhury and involves both astronomers (Trevor Ponman, Louisa Nolan, Ian Stevens, Bill Chaplin, Habib Khosroshahi) and computer scientists (Ela Claridge, Ata Kaban, Alan Sexton, Peter Tino and Xin Yao) at Birmingham. This activity recently led to an e-Science award from PPARC, as a result of which Jianyong Sun has been working as a postdoctoral fellow in this field.

  • Latent variable and Bayesian modelling - the use of kernel methods and support vector machines: Measuring time delays between multiple images in a gravitational lens, using long-term monitoring with high-resolution radio or optical images, is difficult in the presence of correlated noise on various scales. We are investigating several methods of doing this. Bayesian methods developed with Markus Harva (Helsinki) have met with some success, and a kernel-fitting approach using latent variables is the subject of a PhD thesis (Juan Cuevas Tello), jointly supervised by Dr Peter Tino (Computer Science) and Somak Raychaudhury (ASR group). In principle, such modelling could allow us to measure time-delayed signals from unresolved images. The method has much wider application: our next goal is to apply it to finding redshifts automatically from millions of galaxy spectra and to quantifying the spectral widths of emission and absorption lines.

  • Independent component analysis of galaxy spectra: Elliptical galaxies were once believed to consist of a single population of old stars formed coevally at high redshift, followed by predominantly passive evolution. However, more recent hierarchical structure formation models suggest that they are formed from the low-redshift merging of disk galaxies, with associated significant star formation, and recent analyses of galaxy spectra seem to indicate the presence of significant younger populations of stars in at least some elliptical galaxies. The detailed physical modelling of such populations via spectral fitting is computationally expensive, inhibiting the detailed analysis of the several million galaxy spectra which will become available over the next few years. A collaboration between Ata Kaban, Markus Harva, Louisa Nolan and Somak Raychaudhury has developed a data-driven application aimed at decomposing the spectra of galaxies into those of several stellar populations, without the use of detailed physical models. This method includes a Bayesian way of filling in missing data in an ensemble of spectra, and the interpretation of the independent components in terms of old and young stellar populations has already yielded spectacular results.

  • Hierarchical visualization of high dimensional data: Jianyong Sun, Ata Kaban, Peter Tino, Somak Raychaudhury (more to come)

  • Inversion techniques for spectral mapping: An inversion technique for the recovery of physical parameters from multi-colour images, already successfully applied in medical imaging, has been applied to X-ray images, extracted in a set of optimal energy bands, to map the spectral properties of hot gas in clusters of galaxies. This effort, resulting from a collaboration between Ela Claridge, Mark O'Dwyer (CS), Trevor Ponman and Somak Raychaudhury, will facilitate extensive statistical studies of physical properties of galaxies and clusters from large X-ray archives without detailed model-fitting.

  • Genetic algorithms for model discovery: As data improve, the analytical forms traditionally used to model galaxies and their clusters prove to be inadequate, and departures from such simple forms may contain important information on structure and evolution. In work by Jin Li, Xin Yao (CS), Habib Khosroshahi and Somak Raychaudhury, we are examining the use of genetic algorithms to provide a more flexible and sophisticated suite of models. These algorithms allow the models themselves to evolve, in a fashion modelled on biological evolution, to fit photometric observations of both galaxies and clusters.
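
To make the time-delay problem from the first item above more concrete, here is a minimal Python sketch. It is not the method of the papers listed below: it swaps in plain Nadaraya-Watson kernel smoothing with a grid search over trial delays, and it assumes that both images have the same brightness and uncorrelated noise, which real lens data do not. The light curves, the kernel width, and the names kernel_smooth and estimate_delay are all made up for illustration.

```python
import numpy as np

def kernel_smooth(t_obs, y_obs, t_query, width):
    """Nadaraya-Watson estimate of the signal at t_query (Gaussian kernel)."""
    z = (t_query[:, None] - t_obs[None, :]) / width
    weights = np.exp(-0.5 * z**2)
    return (weights @ y_obs) / weights.sum(axis=1)

def estimate_delay(t_a, y_a, t_b, y_b, trial_delays, width):
    """Grid search: pick the delay that best maps image B onto image A."""
    costs = np.empty(len(trial_delays))
    for i, delay in enumerate(trial_delays):
        t_shift = t_b - delay
        # Compare only where the shifted B epochs overlap A's monitoring span.
        inside = (t_shift >= t_a.min()) & (t_shift <= t_a.max())
        predicted = kernel_smooth(t_a, y_a, t_shift[inside], width)
        costs[i] = np.mean((y_b[inside] - predicted) ** 2)
    return float(trial_delays[np.argmin(costs)]), costs

# Synthetic, irregularly sampled light curves (illustrative values only).
rng = np.random.default_rng(0)

def source(t):
    """Hypothetical intrinsic variability of the lensed source."""
    return np.sin(2 * np.pi * t / 60.0) + 0.5 * np.sin(2 * np.pi * t / 23.0 + 1.0)

true_delay = 12.0                          # days, assumed for the demo
t_a = np.sort(rng.uniform(0.0, 200.0, 120))
t_b = np.sort(rng.uniform(0.0, 200.0, 120))
y_a = source(t_a) + rng.normal(0.0, 0.05, t_a.size)
y_b = source(t_b - true_delay) + rng.normal(0.0, 0.05, t_b.size)

trial_delays = np.arange(0.0, 30.5, 0.5)
best_delay, costs = estimate_delay(t_a, y_a, t_b, y_b, trial_delays, width=2.0)
print(f"estimated delay: {best_delay:.1f} days (true value: {true_delay})")
```

The kernel width trades noise suppression against smearing of real variability, so in practice it would have to be chosen by some form of cross-validation rather than fixed by hand as here.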

Publications in this field

Refereed Journals

  1. Finding young stellar populations in early-type galaxies from an independent component analysis of their UV-optical spectra
    Nolan Louisa A., Harva Markus O., Kaban Ata and Raychaudhury Somak, 2006, MNRAS, 366, 321-338.
  2. How accurate are the time delay estimates in gravitational lensing?
    Juan C. Cuevas-Tello, Peter Tino and Somak Raychaudhury, 2006, Astron. Astrophys., 454, 695-706.
  3. Young stellar populations in early-type galaxies in the Sloan Digital Sky Survey
    Nolan Louisa A., Raychaudhury Somak and Kaban Ata, 2006, MNRAS, submitted. (astro-ph/0608623)
  4. A Bayesian approach to estimating time delays between gravitationally lensed multiple images
    Harva Markus O. and Raychaudhury Somak, 2006, MNRAS, submitted.

Refereed Conference Proceedings

  5. An Evolutionary Approach to Modelling Radial Brightness Distributions in Elliptical Galaxies
    J. Li, X. Yao, C. Frayn, H. G. Khosroshahi and S. Raychaudhury, 2004, Lecture Notes in Computer Science 3242, 591-601. Berlin: Springer-Verlag (ISBN 3540230920). Proceedings of the 8th International Conference on Parallel Problem Solving from Nature (PPSN VIII), September 2004.
  6. Mapping the physical properties of cosmic hot gas with hyper-spectral imaging
    Ela Claridge, Mark O'Dwyer, Trevor Ponman and Somak Raychaudhury, 2005, in IEEE Workshop on Applications of Computer Vision (WACV2005), January 2005, pp 185-190. (astro-ph/0505165)
  7. Finding Young Stellar Populations in Elliptical Galaxies from Independent Components of Optical Spectra
    Ata Kaban, Louisa Nolan and Somak Raychaudhury, 2005, SIAM International Conference on Data Mining (SDM05), 21-24 April 2005, Newport Beach, California, USA. (astro-ph/0505059)
  8. Bayesian Estimation of Time Delays Between Unevenly Sampled Signals
    Markus Harva and Somak Raychaudhury, 2006, accepted by IEEE International Workshop on Machine Learning for Signal Processing, September 2006, Maynooth, Ireland.
  9. A kernel-based approach to estimating phase shifts between irregularly sampled time series: an application to gravitational lenses
    Juan C. Cuevas-Tello, Peter Tino and Somak Raychaudhury, 2006, accepted by the 17th European Conference on Machine Learning (ECML06), September 2006, Berlin.
  10. Factorisation of positive valued functions
    Ata Kaban, Louisa Nolan and Somak Raychaudhury, 2006, accepted by ICA Research Network International Workshop (ICArn06), September 2006, Liverpool.
  11. On class visualisation for high dimensional data: Exploring scientific data sets
    Ata Kaban, Jianyong Sun, Somak Raychaudhury and Louisa Nolan, 2006, accepted by Ninth International Conference on Discovery Science (DS-2006), October 2006, Barcelona, Spain (Publisher: Springer).

Un-refereed Conference Proceedings and Posters

  12. Time delay estimation in gravitational lensing: a new approach
    Cuevas-Tello Juan C., Raychaudhury, Somak, Tino Peter, 2005, abstract in the proceedings of the RAS National Astronomy Meeting, Birmingham, April 2005.
  13. Finding young stellar populations in early-type galaxies from independent components of their UV-optical spectra
    Nolan, L., Kaban, A., Raychaudhury, Somak, 2005, abstract in the proceedings of the RAS National Astronomy Meeting, Birmingham, April 2005.
  14. Determining time delay in gravitational lensing: how significant are the results?
    Tino, Peter, Cuevas-Tello Juan C., Raychaudhury, Somak, 2005, abstract in the proceedings of the RAS National Astronomy Meeting, Birmingham, April 2005.
  15. Kernel-based methods applied to irregularly sampled time series
    Cuevas-Tello Juan C., Tino, Peter, Raychaudhury, Somak, 2005, Poster in "The Analysis of Patterns", Centre Ettore Majorana for Scientific Culture, Erice, Italy, October 28 - November 6, 2005.