Geospatial and data mining approaches to assess the impact of watershed development in Indian rainfed areas

Date of Award


Degree Type


Degree Name

Doctor of Philosophy


School of Science

First Advisor

Dr Leisa Armstrong

Second Advisor

Dr Amiya Kumar Tripathy


Watershed development programs in India have played a significant role in improving the livelihoods of rural communities living in rainfed areas. Current assessment methods are limited in their ability to capture interrelated impacts, because watershed development is influenced by factors from multiple domains. Few studies have reported on novel ICT techniques applied to watershed assessment using actual watershed data, or have examined spatial or temporal variation within a watershed.

The objective of the research was to study current novel geospatial and data mining methods used in hydrological assessments of watershed development and to apply the identified methods to real-time watershed data. The study addressed the following major research question: “Can novel geospatial and data mining methods be effectively used to assess the hydrological impacts of watershed development?” To answer this question, the research was carried out in a number of phases, beginning with an examination of existing ICT techniques used for impact assessment of watershed areas.

The research methodology was a mixed-method approach combining a case study, diagnostic research and quantitative analysis. Two contiguous watersheds in a rainfed region of Andhra Pradesh, India, were chosen as the study area. Data sets were sourced from a number of Government and NGO agencies and from field visits. Data representing sixteen hydrological, environmental and social parameters known to influence watersheds were chosen for the study. The data consisted of both spatial and spatio-temporal data. A grid of 2880 cells covering the study area was developed. Data for the period 2006 to 2008 in two seasons (pre-monsoon and post-monsoon) were collected, compiled, classified and assigned to the grid network database. The study area was divided into three zones: upstream, midstream and downstream. The data were preprocessed to make them suitable for further analysis. Preprocessing included watershed delineation, creation of the grid network, handling of point, line and polygon data, and conversion of the data into a unified format. The data were then converted into nominal classes for use in data mining.
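The conversion of numeric measurements into nominal classes attached to grid cells can be illustrated with a minimal Python sketch. The class boundaries, labels, grid layout (48 rows by 60 columns, chosen only because it yields 2880 cells) and sample records below are all assumptions made for illustration; the abstract does not state the thresholds or grid dimensions actually used.

```python
# Hypothetical class boundaries (upper bound in metres below ground, label).
# The thesis abstract does not state the thresholds it used.
GROUNDWATER_BINS = [(5.0, "shallow"), (15.0, "moderate")]

N_COLS = 60  # assumed layout: 48 rows x 60 columns = 2880 cells


def to_nominal(value, bins, top_label):
    """Return the label of the first bin whose upper bound exceeds value."""
    for upper, label in bins:
        if value < upper:
            return label
    return top_label


def cell_id(row, col, n_cols=N_COLS):
    """Map a (row, col) grid position to a single integer cell index."""
    return row * n_cols + col


# Illustrative records: (row, col, season, groundwater depth in m). Not real data.
observations = [
    (0, 0, "pre-monsoon", 18.2),
    (0, 0, "post-monsoon", 9.4),
    (0, 1, "pre-monsoon", 4.1),
]

# One nominal record per grid cell and season, ready for data mining.
grid_records = [
    {
        "cell": cell_id(r, c),
        "season": season,
        "groundwater": to_nominal(depth, GROUNDWATER_BINS, "deep"),
    }
    for r, c, season, depth in observations
]
```

Keeping the thresholds in one table makes it easy to rerun the classification with different class boundaries and see how sensitive later results are to them.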

The watershed data set was analysed using three approaches: descriptive statistics, geospatial analysis and data mining. The first approach used descriptive statistics based on univariate analysis with pivot tables. This analysis used all sixteen watershed parameters. A series of scenarios for soils, groundwater levels, land use and check dams was examined. The second approach was a geospatial analysis based on optimised hot spot analysis. It used NDVI and groundwater level data as input parameters, and the results were examined in relation to land use and the location of check dams. The third approach applied spatial data mining techniques, namely DBSCAN clustering and Apriori-based association rule mining, to the watershed data. This analysis used fourteen spatio-temporal parameters. Its output was visualised in a GIS environment.
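To show what Apriori-based association rule mining over nominal grid-cell records involves, the following is a minimal pure-Python sketch. It is not the implementation used in the thesis. The attribute labels (such as `checkdam=yes`), the four sample cells and the support and confidence thresholds are hypothetical.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Level-wise Apriori search: return {itemset: support} for frequent itemsets."""
    n = len(transactions)
    current = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while current:
        # Keep the size-k candidates whose support meets the threshold.
        level = {}
        for cand in current:
            support = sum(1 for t in transactions if cand <= t) / n
            if support >= min_support:
                level[cand] = support
        frequent.update(level)
        # Join frequent k-itemsets into (k+1)-candidates, pruning any candidate
        # with an infrequent k-subset (the Apriori property).
        k += 1
        current = set()
        for a, b in combinations(level, 2):
            union = a | b
            if len(union) == k and all(
                frozenset(s) in level for s in combinations(union, k - 1)
            ):
                current.add(union)
    return frequent


def rules(frequent, min_confidence):
    """Yield (antecedent, consequent, confidence) for rules above the threshold."""
    for itemset, support in frequent.items():
        for size in range(1, len(itemset)):
            for antecedent in map(frozenset, combinations(itemset, size)):
                confidence = support / frequent[antecedent]
                if confidence >= min_confidence:
                    yield antecedent, itemset - antecedent, confidence


# Hypothetical grid cells, each described by nominal attribute=value items.
cells = [
    frozenset({"zone=upstream", "checkdam=yes", "groundwater=shallow"}),
    frozenset({"zone=upstream", "checkdam=yes", "groundwater=shallow"}),
    frozenset({"zone=downstream", "checkdam=no", "groundwater=deep"}),
    frozenset({"zone=midstream", "checkdam=yes", "groundwater=shallow"}),
]

freq = frequent_itemsets(cells, min_support=0.5)
found = list(rules(freq, min_confidence=0.9))
```

Each rule in `found` pairs an antecedent with a consequent and a confidence value. In a workflow like the one described, such rules could be joined back to cell identifiers so that the cells supporting a rule can be mapped in a GIS.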

A comparison of the results showed that all three approaches provided some insight into the factors influencing watershed development. The descriptive statistics provided a simple analysis of trends in the parameters, but were limited in their ability to show the interrelationships between parameters. The geospatial analysis was useful in understanding spatial and temporal trends across the watershed area, but can only be used for spatial data with numeric values. The data mining analysis was useful in revealing previously hidden relationships between the parameters influencing the watershed area, and could be used for both spatial and spatio-temporal data. The results of each approach require some expertise to interpret in terms of the effects of changes in the watershed area, because the relationships are complex and the effects of individual parameters are shaped by their interrelationships. The granularity of the outputs varied between approaches. A combination of the approaches made it possible to investigate these effects at levels ranging from general data trends to complex data analysis.

The approaches were validated against a similar study carried out by an ACIAR-funded project. Some of the findings of this thesis could be validated against the ACIAR-based studies. The importance of factors such as groundwater level, watershed zone and rainfall was noted in both studies. Although the ACIAR research was conducted in a similar study area, it was limited in its analysis of upstream/downstream interactions and did not study the integration of multiple parameters in a robust manner.

The research was considered novel in integrating three different approaches to watershed impact assessment, using hydrological, social and environmental parameters for contiguous watershed data with both spatial and temporal analysis. It was also novel in proposing a hybrid method that uses geospatial analysis and data mining together and visualises the output of data mining in a GIS environment.

This research proposed a novel integrated technology-based framework for impact assessment, comprising datasets, processing, analysis and results components. The framework could be developed into a decision support tool for assessing the impacts of watershed development, helping researchers and planners to provide an unbiased assessment from a range of perspectives. The framework can be used at different spatial and temporal scales.

Access Note

Access to this thesis is embargoed until 2nd August 2024.

Access to this thesis is restricted. Please see the Access Note above for access details.


Paper Location