Ever wonder what you should do with old pictures? Perhaps you digitized them and then stuck them on a hard drive somewhere, only to forget about them?
Well, if you are the Harvard College Observatory, you realized that over a century's worth of stargazing might very well lead to new discoveries. At least, that is the hope behind the monumental effort that culminated in March 2024 (with a publication announcement in January 2025): the completion of the DASCH archiving project.
What Are Astronomical Plates?
Astronomical photographic plates are glass plates coated with a light-sensitive emulsion, historically used to capture images of the night sky before the advent of digital imaging. These plates served as the primary medium for recording astronomical observations from the late 19th century through much of the 20th century.
Each plate in the collection is a tangible record of the sky at a specific moment in time, preserving invaluable data on celestial objects and their historical positions and brightness. Harvard University maintains the most extensive collection of glass astronomical plates in the world through the Plate Stacks project.
What is the DASCH Project?
The Digital Access to a Sky Century at Harvard (DASCH) project is a monumental initiative undertaken by the Harvard College Observatory to digitize its extensive collection of astronomical photographic glass plates.
Spanning observations from 1885 to 1992, this collection comprises over 550,000 plates, making it the largest of its kind globally. The primary objective of DASCH is to preserve these invaluable historical records and make them accessible for scientific research, particularly in the field of time-domain astronomy, which studies how celestial objects change over time.
Launched in 2004 under the leadership of Principal Investigator Jonathan E. Grindlay, DASCH faced the significant challenge of efficiently digitizing such a vast archive. When commercial scanners proved inadequate due to their slow processing speeds, the team designed a custom high-speed scanner capable of handling the delicate glass plates.
Over two decades, the dedicated DASCH team successfully digitized 435,763 plates, with the final scan completed on March 28, 2024. The team comprised over 120 individuals and institutions, among them students, professors, and volunteers.
The culmination of this extensive effort is DASCH Data Release 7 (DR7), made available on December 29, 2024. DR7 offers a comprehensive dataset that enables researchers to study the entire night sky over a century-long timescale.
Image Source: StarGlass project website
Portal for the Public
To facilitate public and scientific access to this wealth of data, the DASCH team launched StarGlass, an online portal for exploring the digitized plate collection, along with a comprehensive REST web API.
Through StarGlass, individuals and researchers can search for objects by name or coordinates, view images captured on specific dates, and delve into the historical context of the observations, including information about the pioneering women astronomers, known as the "Harvard Computers," who contributed significantly to the collection.
Image Source: StarGlass website categories
Enabling New Research Frontiers
Researchers interested in accessing the Digital Access to a Sky Century at Harvard (DASCH) digital archives have several options, each tailored to different needs and technical proficiencies.
This access will allow citizen and professional researchers alike to help uncover new insights about the night sky over the last century. The integration with Jupyter notebooks (daschlab) and a comprehensive REST API will also allow machine learning and AI processing, potentially unlocking new discoveries.
daschlab
The primary and recommended tool for accessing and analyzing DASCH Data Release 7 (DR7) is daschlab, a Python toolkit designed for efficient data retrieval and analysis. This toolkit offers cloud-based Jupyter notebooks, enabling users to perform basic data operations without the need for extensive Python knowledge or software installation. These notebooks provide an interactive environment to explore DASCH data seamlessly. To get started with daschlab, refer to the DR7 documentation.
Daschlab
For users who prefer programmatic access or wish to integrate DASCH data into custom applications, DASCH offers a suite of web API endpoints. These APIs provide RESTful, JSON-oriented interfaces to retrieve DR7 data, allowing for flexible and automated data access across various programming environments. Detailed documentation and additional resources for these APIs are available on the DR7 Web APIs page. Additional API documentation is also available.
Web API Documentation for StarGlass
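As a rough illustration of what a JSON-oriented REST call looks like from Python, the sketch below builds (but does not send) a POST request for a cone search around a sky position. The base URL, the `querycat` path, and the payload field names are placeholders assumed for illustration, not the documented DASCH interface; the real endpoints and parameters are listed in the DR7 Web APIs documentation.

```python
import json
import urllib.request

# ASSUMPTION: placeholder base URL and endpoint path, not the real DASCH API.
BASE_URL = "https://example.invalid/dasch/dr7"


def build_cone_search(ra_deg, dec_deg, radius_arcsec, endpoint="querycat"):
    """Build a JSON POST request for sources near a sky position.

    The payload field names are hypothetical; check the DR7 Web APIs
    documentation for the real ones.
    """
    if not 0.0 <= ra_deg < 360.0:
        raise ValueError("ra_deg must be in [0, 360)")
    if not -90.0 <= dec_deg <= 90.0:
        raise ValueError("dec_deg must be in [-90, 90]")
    payload = {
        "ra_deg": ra_deg,
        "dec_deg": dec_deg,
        "radius_arcsec": radius_arcsec,
    }
    return urllib.request.Request(
        url=f"{BASE_URL}/{endpoint}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=30):
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Build and inspect the request without touching the network.
    req = build_cone_search(ra_deg=10.68, dec_deg=41.27, radius_arcsec=60)
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
```

Keeping request construction separate from `send` makes the payload easy to inspect and test before any network traffic happens, which is handy when adapting the sketch to the real endpoint names.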
StarGlass is an online portal designed to facilitate exploration of the Harvard College Observatory's plate collection. While it does not provide access to DASCH lightcurve data, StarGlass offers plate photographs and full-plate FITS mosaics. Users can search for specific plates, view detailed metadata, and download high-resolution images.
By creating a StarGlass user account, researchers can obtain an API key, granting higher rate limits for data requests, including mosaic retrievals. This feature is particularly beneficial for users requiring bulk data access or integration into automated workflows. The portal is available at https://starglass.cfa.harvard.edu/.
StarGlass User Portal for Searching Plates
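Automated workflows that run into a rate limit usually need to slow down and retry. The following is a minimal sketch of that pattern, assuming the server signals rate limiting with HTTP status 429 and that the API key travels in an `x-api-key` header. Both are assumptions for illustration, not confirmed StarGlass behavior, and the `send` callable is a hypothetical stand-in for whatever HTTP client is in use.

```python
import time

RATE_LIMITED = 429  # ASSUMPTION: status code the server uses for rate limiting


def auth_headers(api_key=None):
    """Return request headers, adding the key when one is available.

    ASSUMPTION: the header name "x-api-key" is illustrative only.
    """
    headers = {"Accept": "application/json"}
    if api_key:
        headers["x-api-key"] = api_key
    return headers


def fetch_with_backoff(send, headers, max_tries=5, base_delay=1.0, sleep=time.sleep):
    """Call send(headers) until the response is not rate limited.

    send is a hypothetical callable returning (status_code, body).
    The wait doubles after each rate-limited attempt: 1 s, 2 s, 4 s, ...
    """
    for attempt in range(max_tries):
        status, body = send(headers)
        if status != RATE_LIMITED:
            return status, body
        if attempt < max_tries - 1:
            sleep(base_delay * 2 ** attempt)
    raise RuntimeError(f"still rate limited after {max_tries} tries")
```

Passing `sleep` as a parameter keeps the function testable without real waiting; in normal use the default `time.sleep` applies.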
Data Access Restrictions
As of 2024, all previous restrictions on DASCH data access have been lifted, granting researchers full access to the entire dataset. This unrestricted access empowers the scientific community to conduct extensive research using over a century's worth of astronomical observations.
Community Engagement
To stay informed about updates, discussions, and support related to DASCH data analysis, researchers are encouraged to join the DASCH Astrophysics email list. This platform serves as a hub for collaboration, announcements, and assistance, fostering a vibrant community of users and contributors.
Enabling New Kinds of Research
With over a century's worth of astronomical data now in digital format, the DASCH project will enable research with cutting-edge tools that were simply the stuff of science fiction when the project started.
For example, using AI, researchers will now be able to unlock new insights from the first long-term temporal variability survey on timescales of days to decades. They will be able to study changes in the brightness of objects over a timespan that was previously unavailable. Changes in brightness can reveal evidence of exoplanets, rare astronomical occurrences, and just maybe evidence of alien life. While we probably won't find any spaceships on these plates, the excitement for new discoveries is just as high.
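To make "changes in brightness" concrete, here is a minimal sketch of one common way to flag a variable object from a light curve: compare the observed scatter with the scatter expected from measurement error alone, using the reduced chi-square against a constant-brightness model. The sample magnitudes, the uncertainties, and the threshold of 3 are invented for illustration; they are neither DASCH data nor DASCH's own method.

```python
def weighted_mean(mags, errs):
    """Inverse-variance weighted mean magnitude."""
    weights = [1.0 / e ** 2 for e in errs]
    return sum(w * m for w, m in zip(weights, mags)) / sum(weights)


def reduced_chi2(mags, errs):
    """Reduced chi-square of the data against a constant-brightness model.

    Values near 1 mean the scatter matches the quoted errors; values
    well above 1 suggest real variability (or underestimated errors).
    """
    if len(mags) < 2:
        raise ValueError("need at least two measurements")
    mean = weighted_mean(mags, errs)
    chi2 = sum(((m - mean) / e) ** 2 for m, e in zip(mags, errs))
    return chi2 / (len(mags) - 1)


def looks_variable(mags, errs, threshold=3.0):
    """Flag a light curve whose scatter exceeds the threshold (an assumption)."""
    return reduced_chi2(mags, errs) > threshold


# Invented example light curves: magnitudes with 0.1 mag uncertainties.
steady = [12.00, 12.05, 11.95, 12.02, 11.98]
dimming = [12.00, 12.05, 12.60, 12.90, 12.02]
errors = [0.1] * 5

print(looks_variable(steady, errors))
print(looks_variable(dimming, errors))
```

With these made-up numbers the steady curve is not flagged and the dimming one is. A real analysis would also have to deal with plate-to-plate systematics and unevenly spaced observations, which this sketch ignores.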
The Data Pipeline
As a seasoned software developer, I was particularly impressed with the processing pipeline. Over the project's 20-year span, the tools and procedures had to evolve without compromising previous work, which is no simple task!
Digitizing the Harvard College Observatory's extensive photographic plate collection involved a meticulously designed data processing pipeline (including custom hardware) to handle the unique challenges of historical astronomical data and its varying condition and quality.
Data Processing Pipeline Overview
The DASCH pipeline was developed to convert scanned images of photographic plates into precise, scientifically valuable data. It was a tedious and challenging task that included:
Image Extraction: Utilizing Emmanuel Bertin's SExtractor program, the pipeline identifies and extracts star images from the digitized plates.
Astrometric Calibration: Astrometry.net determines the plate centers and orientations with a 99.5% success rate, resolving discrepancies between recorded and actual positions. Jessica Mink's WCSTools further refines these positions by matching them to the Tycho-2 catalog, accounting for stellar proper motions. SCAMP software then applies polynomial fitting to correct for distortions inherent in the original telescope optics.
Photometric Calibration: The pipeline divides each plate into nine annular bins to address variations in image quality due to vignetting and optical distortions. A locally weighted scatterplot smoothing (lowess) algorithm calibrates the instrumental magnitudes extracted by SExtractor against reference catalogs such as GSC2.3.2, Kepler Input Catalog, or APASS, adjusting for differences in color response between the photographic emulsions and modern detectors.
Defect Filtering: A defect filter analyzes the shape parameters of detected objects to distinguish genuine stellar images from artifacts like dust, scratches, or development flaws on the plates.
Multiple Exposure Handling: Some plates contain multiple exposures to extend dynamic range or include calibration fields. The pipeline identifies these by iteratively matching detected objects to catalog positions, flagging overlapping images as blended or uncertain when necessary.
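The photometric calibration step above can be sketched in miniature. The code below assigns each star to one of nine annular bins by its distance from the plate center, then fits a locally weighted linear curve (the core idea behind lowess, without its robustness iterations) that maps instrumental magnitudes to catalog magnitudes. The equal-width bins, the tricube weights, the `frac` parameter, and all function names are assumptions for illustration; this is not the actual DASCH pipeline code.

```python
import math


def annular_bin(x, y, center, r_max, n_bins=9):
    """Index (0 = innermost) of the annulus containing pixel (x, y).

    ASSUMPTION: annuli have equal radial width.
    """
    r = math.hypot(x - center[0], y - center[1])
    return min(n_bins - 1, int(n_bins * r / r_max))


def local_linear_fit(xs, ys, x0, frac=0.5):
    """Locally weighted linear estimate of y at x0.

    Uses the nearest frac of the points with tricube weights, then a
    weighted least-squares line through them.
    """
    k = max(2, math.ceil(frac * len(xs)))
    nearest = sorted(zip(xs, ys), key=lambda p: abs(p[0] - x0))[:k]
    # Bandwidth slightly beyond the farthest neighbor so its weight is > 0.
    h = abs(nearest[-1][0] - x0) * 1.001 or 1.0
    w = [(1 - (abs(x - x0) / h) ** 3) ** 3 for x, _ in nearest]
    sw = sum(w)
    mx = sum(wi * x for wi, (x, _) in zip(w, nearest)) / sw
    my = sum(wi * y for wi, (_, y) in zip(w, nearest)) / sw
    sxx = sum(wi * (x - mx) ** 2 for wi, (x, _) in zip(w, nearest))
    sxy = sum(wi * (x - mx) * (y - my) for wi, (x, y) in zip(w, nearest))
    slope = sxy / sxx if sxx > 0 else 0.0
    return my + slope * (x0 - mx)


def group_by_annulus(stars, center, r_max, n_bins=9):
    """Group (x, y, instrumental_mag, catalog_mag) tuples by annulus.

    Returns {bin_index: (instrumental_mags, catalog_mags)} so that each
    annulus can get its own calibration curve.
    """
    groups = {}
    for x, y, inst, cat in stars:
        b = annular_bin(x, y, center, r_max, n_bins)
        inst_list, cat_list = groups.setdefault(b, ([], []))
        inst_list.append(inst)
        cat_list.append(cat)
    return groups
```

A calibrated magnitude for a star in a given annulus would then come from `local_linear_fit(inst_mags, cat_mags, star_inst_mag)` using that annulus's lists. Fitting each annulus separately is what lets the calibration absorb image quality that degrades toward the plate edge.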
The team's dedication to completing the project is a testament to the innovation they believe it could unlock. I look forward to seeing what new insights emerge from our beginnings in the study of the world beyond ourselves!
Harvard Completes 100 years of Astronomical Data Archive: Enabling New A.I. Research Into the Stars