The Ministry of Rural Development has released geo-tagged data for more than 7,00,000 rural educational, agro and health facilities that were surveyed for the purpose of selecting roads under PMGSY-III. The dataset can be accessed by going to the OMMAS homepage, selecting “Other Reports” and clicking on “Facility Details”. That report allows data to be downloaded for one state at a time. Please read the FAQs before proceeding.

We invite government departments, academia, startups etc. to use this data to inform policy and fill gaps in rural India.

The dataset, titled “PMGSY Rural Facilities Dataset”, is dedicated to the government frontline engineers who are responsible for planning, construction, monitoring and maintenance of rural roads under PMGSY and who also spearheaded the collection of this dataset.

Frequently Asked Questions about PMGSY Rural Facilities Dataset 2020

Q1. For what purpose was this data originally collected, and who collected it?
A. The data was collected for the purpose of selecting/ranking roads for PMGSY-III. It was collected by frontline government road engineers or, in some cases, contracted out to a third party. The selection mechanism is described in detail in the PMGSY-III guidelines ( https://pmgsy.nic.in/sites/default/files/PMGSY_III_guidelines.pdf )

Q2. What file format is the dataset in?
A. The dataset can be downloaded as Excel workbooks or as PDF files.

Q3. How many Facility Categories are there?
A. There are four categories in total: Medical, Agro, Education and Transport/Admin.

Q4. What is the unit Habitation & what does the Habitation code represent?
A. “Habitation” is the lowest geographical unit used in PMGSY. It is unique to the program and is inherited from its first phase, whose objective was to provide all-weather road connectivity to all unconnected rural habitations. Some states have used revenue villages as habitations from PMGSY-II onwards, but the unit is still referred to as a habitation.
The Habitation code is an internal primary key used to identify habitations and is not related to LGD or Census 2011 codes.

Q5. Why is XYZ facility missing in the dataset?
A. Facilities in urban areas were not surveyed. Otherwise, a facility in a rural habitation may be missing from this dataset for reasons related to the data collection process (see Q9).

Q6. Under what license has the data been released? What are the terms and conditions?
A. The data has been released under the Government Open Data License, India. The terms and conditions can be read here: https://data.gov.in/government-open-data-license-india.

Q7. How should this dataset be cited?
A. Users should cite this data as “PMGSY Rural Dataset, 2020 https://omms.nic.in/Home/PMGSYRuralDataset/”

Q8. Which facilities were surveyed?
A. The list of facilities to be surveyed under the guidelines of the scheme can be seen on Pg 37 of the PMGSY-III Guidelines ( https://pmgsy.nic.in/sites/default/files/PMGSY_III_guidelines.pdf ). Examples include High Schools, Higher Secondary Schools, Vet Hospitals, PHCs, CHCs, Bedded Hospitals, Bus Stands, Block HQs, Panchayat HQs, Banks, Fuel Stations, Cold Storages, Agro Industries, Pack Houses, Collection Centres etc.

Q9. How was this data collected?
A. A common mobile application was developed by C-DAC and used by field engineers/third party consultants to undertake the survey. Technical training on the application was conducted, and basic guidelines were provided on how to conduct the survey. The states were allowed to interpret the definitions of facilities as long as they remained within the overall categories defined by the PMGSY guidelines. For example, some states have chosen to count taxi stands in place of bus stands, as taxis are the primary mode of transport in the region concerned. Similarly, the definition of an agricultural industry is very context-specific. Some states have chosen to survey public as well as private facilities, whereas others have limited the survey to public facilities only.

Q10.
Can this dataset be used for comparing rural facilities across states?
A. Any comparison should be made with an understanding of the following constraints: no strict definitions/terminology were employed, the primary data collectors are in most cases not trained enumerators, and accuracy may vary across blocks/districts/states.

Q11. Were the facilities surveyed audited?
A. A maker-checker mechanism was instituted at the state level, and a sample of facilities was audited centrally for accuracy. This does not guarantee that every surveyed record is accurate or that the dataset is exhaustive.

Q12. Why is data for certain geographies entirely missing?
A. The dataset is collected for PMGSY-III, and states become eligible for the scheme only after meeting certain conditions. Not all states/UTs have been onboarded so far.

Q13. Where can I access the digitized roads and habitations under PMGSY?
A. At present they can only be viewed at pmgsy-grris.nic.in/. That data has not yet been released as open data.

Q14. Why do certain facilities not have lat/long attached?
A. This means that the survey is not yet complete in the block concerned. All facilities used for the primary purpose of the program must be geo-tagged through the common mobile application.

Q15. Certain facilities seem to be outside the geographic extent of the country/state/district etc. Why is that so?
A. The common mobile application uses the GPS coordinates provided by the surveyor's mobile handset. The accuracy depends on the handset, the warm-up period and the region in which the facilities are being geo-tagged.
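
Users who want to screen a downloaded state file for the two issues described in Q14 and Q15 (missing coordinates and coordinates that fall outside the expected extent) could start from a minimal Python sketch like the one below. It is illustrative only and is not part of the official release. The column names `latitude` and `longitude`, the function names and the rough bounding box for India are all assumptions; check them against the actual workbook headers and the extent of the state or district being studied.

```python
import math
from typing import Optional

# Rough bounding box for India, island territories included (an assumption,
# not an official extent): (min_lat, max_lat, min_lon, max_lon) in degrees.
INDIA_BBOX = (6.0, 38.0, 68.0, 98.0)


def _to_float(value) -> Optional[float]:
    """Return value as a float, or None if it is empty, non-numeric or NaN."""
    try:
        number = float(value)
    except (TypeError, ValueError):
        return None
    return None if math.isnan(number) else number


def classify_coordinates(lat, lon, bbox=INDIA_BBOX) -> str:
    """Label one record as 'missing', 'out_of_extent' or 'ok'."""
    lat_f, lon_f = _to_float(lat), _to_float(lon)
    if lat_f is None or lon_f is None:
        return "missing"  # survey possibly incomplete in that block (Q14)
    min_lat, max_lat, min_lon, max_lon = bbox
    if min_lat <= lat_f <= max_lat and min_lon <= lon_f <= max_lon:
        return "ok"
    return "out_of_extent"  # possible GPS error on the handset (Q15)


def summarize(rows, lat_key="latitude", lon_key="longitude"):
    """Count records per label; lat_key/lon_key are hypothetical column names."""
    counts = {"ok": 0, "missing": 0, "out_of_extent": 0}
    for row in rows:
        label = classify_coordinates(row.get(lat_key), row.get(lon_key))
        counts[label] += 1
    return counts
```

Here `rows` is any iterable of dictionaries, one per facility. If pandas is available, one option is to load a workbook with `pandas.read_excel` and pass `df.to_dict("records")` to `summarize`. A bounding box is only a coarse filter: a point can lie inside the box and still be in the wrong state or district, so a stricter check would compare each point against administrative boundary polygons.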