DATA 150: Human Development/Data Science

Course ID: DATA 150
Course Attribute: COLL 150
Title: Evolving Solutions: Human Development/Data Science
Credit Hours: 4
Meeting Times: 9:30 to 10:50 TTh
Location: Remote Synchronous Off-Campus
Date Range: Spring Semester 2021

Course Description

This course is an introductory exploration of the intersection of global human development and data science. Each student will be introduced to relevant development topics as they pertain to high-resolution global descriptions of population, natural resources and the built environment. Students will explore current work on global development surveys for demographic analysis, as well as the use of census and call detail records to describe human movement. Additionally, students will develop an annotated bibliography, write a literature review, form a central research question and seek to identify a research gap. The course concludes with a group presentation and an individual research proposal that highlight and summarize the most significant findings from the semester.

Prerequisite(s): None

Goals and Objectives:

  • To learn about the research process and how it can be used to build a knowledge base to support your writing.
  • To investigate a subject matter by defining its boundaries, identifying significant contributing areas of interest and discarding non-essential and/or non-contributing information.
  • To use a body of knowledge as the basis for writing a research paper, including the formulation of a central research question.
  • To produce a scientific article, including a literature review, methodological discussion, citations, bibliography, abstract and information about the author.
  • To challenge students to become apprentice scholars and identify a central research focus of interest.

Honor Code

Among our most significant traditions is the student-administered honor system. The Honor Code is an enduring tradition with a documented history that originates as far back as 1736. The essence of our honor system is individual responsibility. Today, students, such as yourself, administer the Honor pledge to each incoming student while also serving to educate faculty and administration on the relevance of the Code and its application to students’ lives.

The Pledge

“As a member of the William and Mary community, I pledge on my honor not to lie, cheat, or steal, either in my academic or personal life. I understand that such acts violate the Honor Code and undermine the community of trust, of which we are all stewards.”

Accessibility, Attendance & Universal Learning

William & Mary accommodates students with disabilities in accordance with federal laws and university policy. Any student who feels s/he may need accommodation based on the impact of a learning, psychiatric, physical, or chronic health diagnosis should contact Student Accessibility Services staff at 757-221-2509 or at sas@wm.edu to determine if accommodations are warranted and to obtain an official letter of accommodation. For more information, please see www.wm.edu/sas.

I am committed to the principle of universal learning. This means that our classroom, our virtual spaces, our practices, and our interactions should be as inclusive as possible. Mutual respect, civility, and the ability to listen and observe others carefully are crucial to universal learning. Active, thoughtful, and respectful participation in all aspects of the course will make our time together as productive and engaging as possible.

Grade Categories

  • exceptional A = 100 ≥ 97.0
  • excellent A = 96.9 ≥ 93.0
  • superior A- = 92.9 ≥ 90.0
  • very good B+ = 89.9 ≥ 87.0
  • good B = 86.9 ≥ 83.0
  • above average B- = 82.9 ≥ 80.0
  • normal C+ = 79.9 ≥ 77.0
  • average C = 76.9 ≥ 73.0
  • sub par C- = 72.9 ≥ 70.0
  • below average D+ = 69.9 ≥ 67.0
  • poor D = 66.9 ≥ 63.0
  • very poor D- = 62.9 ≥ 60.0
  • failing F < 60.0

Note: each upper bound ending in .9 is a repeating decimal (.9 with bar notation).

Grading Opportunities

  • Three informal presentations: 20%, due every third Friday
  • Weekly informal responses: 20%, due on a regular basis
  • Four formal writing assignments: 40%, due every third Saturday
  • Final project: 20%, due by 5PM on the last day of finals
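As a rough illustration of how the four category weights combine with the grade scale above, the following Python sketch computes a weighted course score and looks up its letter. The category keys, function names and example scores are hypothetical and chosen only for illustration; the sketch also assumes each category is scored on a 0 to 100 scale, which the syllabus does not state.

```python
# Category weights taken from the Grading Opportunities list (fractions of the final grade).
# The dictionary keys are hypothetical labels, not official names.
WEIGHTS = {
    "presentations": 0.20,
    "responses": 0.20,
    "writing": 0.40,
    "final_project": 0.20,
}

# Lower bounds from the Grade Categories scale, highest first.
# The two top bands (97.0 and above, 93.0 and above) both carry the letter A,
# so a single 93.0 cutoff covers them.
CUTOFFS = [
    (93.0, "A"), (90.0, "A-"),
    (87.0, "B+"), (83.0, "B"), (80.0, "B-"),
    (77.0, "C+"), (73.0, "C"), (70.0, "C-"),
    (67.0, "D+"), (63.0, "D"), (60.0, "D-"),
]


def weighted_score(scores):
    """Return the weighted average of per-category scores (assumed 0 to 100)."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


def letter_grade(score):
    """Map a numeric score to the first band whose lower bound it meets."""
    for lower, letter in CUTOFFS:
        if score >= lower:
            return letter
    return "F"


# Hypothetical scores for one student.
example = {"presentations": 90, "responses": 85, "writing": 88, "final_project": 92}
total = weighted_score(example)
print(round(total, 1), letter_grade(total))  # 88.6 B+
```

Because the formal writing assignments carry 40% of the grade, a change of ten points in that category moves the total by four points, twice the effect of the same change in any other category.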

Semester Schedule

Week 1 (9/1)

  • Thursday:
    • Introductions
      • Zoom, Slack, Blackboard & the Syllabus
    • For next time:

Week 2 (9/5)

  • Tuesday:

    • Using Markdown & GitHub
    • Don’t forget people in the use of big data for development, Joshua Blumenstock. Nature: Sept. 10, 2018
    • Response questions:
      • Joshua Blumenstock states that a humbler data science could transform international development while also limiting the number of alleged silver bullets that have missed their mark in recent years. Describe the promise, pitfalls and ways forward Blumenstock uses as the foundation for his thesis. Additionally, consider the following statements from three of your classmates regarding this article. (1) "Good intent is not enough in data science when dealing with the problems which determine people's experiences" Anna Raymond (2) "Transparency is the underlying issue to many of these problems, so an increase in this on both ends (data based issues & human based issues) could lead to better results." Nira Nair (3) "In lieu of such drastic potential for promoting applications yet demoralizing hinderances, the balancing act can become difficult." Kayla Seggelke How do you respond to these ideas regarding "good intent", "transparency" and the difficult "balancing act" when considering the intersection of human development with data science? Please prepare your response for next Thursday's class.
  • Thursday:

    • The Best Stats You've Ever Seen, Hans Rosling. TED: February, 2006
    • Response Questions:
      • What was Hans Rosling's observation regarding his comparative survey of students at the Karolinska Institute and the Chimpanzees (as well as the faculty who decide the Nobel Prize)? What is the significance of the results from his informal survey on preconceived ideas?
      • What type of change took place in Asia that preceded economic growth? Why was this type of change significant?
      • In accordance with Hans Rosling’s TED talk, what is the relationship between child mortality and GDP per capita?
      • In terms of income distribution, how has the world changed from 1962 until 2003?
      • What is the significance of how Hans Rosling uses data to describe global human development in terms of very high spatial and temporal resolutions? How does this relate to his previous observation regarding preconceived ideas?
      • In your opinion, why was Hans Rosling’s work with the Gapminder project significant in contributing towards advancing the intersection of data science and global human development?
    • For next time:
    • Friday: add/drop period ends

Week 3 (9/12)

Week 4 (9/19)

Week 5 (9/26)

Week 6 (10/3)

Week 7 (10/10)

Week 8 (10/17)

Week 9 (10/24)

Week 10 (10/31)

Week 11 (11/7)

Week 12 (11/14)

Week 13 (11/21)

  • Tuesday:
  • Thursday:
    • Planning final presentations
  • Sunday: Methods paper due at midnight

Week 14 (11/28)

  • Tuesday:
    • Workshop
  • Thursday:
    • Final Presentations

Week 15 (12/5)

  • Tuesday:
    • Final Presentations
  • Thursday: Last day of class
    • Final Presentations

Final

  • Research plan is due no later than 5PM on December 21st.

Assignments

Assignment 1: Write an Annotated Bibliography

Summary

In this assignment you will begin your exploratory investigation into a geospatial human development process. You will propose a research topic and select a region or country in which to investigate a dimension of human development. Then you will need to research and select four articles that use geospatial data to answer a scientific question about human development. Once you have identified your articles, annotate each one by identifying the most important ideas that have been addressed in the work as well as by answering a series of questions about the intersection of each source with the main themes of this course. Through the development of these annotations you will begin to establish a boundary for your research while describing and defining the landscape of knowledge that populates your selected area of investigation. Your annotated bibliography is the starting point for describing the problem you will investigate in terms of harms, significance, and inherency, as well as a means for clarifying the parameters of your inquiry.

Instructions

Begin by proposing a research topic you would like to investigate this semester. Select a topic that includes the use of data science methods and applications to describe, analyze, model and/or simulate a geospatial human development pattern and/or process. Following are some suggested human development topics where data science methods are being actively developed.

  • Disaster management and response
  • Socioeconomic Analysis
  • Precision Epidemiology and disease burden estimation
  • Migration
  • Urbanization
  • Infrastructure and social service
  • Accessibility modeling
  • Transportation modeling
  • Poverty assessment and analysis
  • Resource management and allocation
  • Food security
  • Environmental Impact Assessment
  • Climate change
  • Land use

Making an appointment with a research librarian at Swem in order to receive assistance and individual attention is a good place to start. Be sure to share your assignment with your Research Librarian in advance of meeting, so she/he will be best prepared at the time you meet.

Swem Library Appointment

Publications made available on the Flowminder and WorldPop websites are also both good places to start with your investigation. After selecting your four articles, annotate each one by identifying and describing the most important ideas presented in each work. Identify and describe the significant harm (development problem) each article is seeking to address. Also identify and describe the inherent and complex nature of the development problem as it pertains to the region or country of investigation. Additionally, answer the following questions.

  • How does this article relate to Amartya Sen’s definition of human development?
  • Which dimension of human development is being addressed by the authors’ research?
  • Which sustainable development goals can be considered in relation to your selected article?
  • Which (geospatial) datasets are used by the authors?
  • Which (geospatial) data science methods are used by the authors?
  • Which human development pattern or process are the authors investigating?
  • What is the scientific question the authors are seeking to answer?

After selecting a research topic and annotating each source, propose a region, country or countries that you think will be interesting for further investigation. Identify and describe the local nature and complex challenges associated with your selected human development problem within the context of that location.

Deliverable

Type your annotated bibliography. Your annotated bibliography should be at least 1600 words in length and include a minimum of 4 different sources. Title your annotated bibliography and be sure to include your name. Note your word count on the first page. Do not include your title or reference information when calculating your word count. Please post your annotated bibliography as a link on the index of your GitHub pages site no later than 6PM, Saturday, October 2nd by 5PM.


Assignment 2: Write a Literature Review

Summary

In this assignment you will use your annotated bibliography as the beginning point to write a literature review on your selected human development topic. Your literature review should amount to more than reporting what you’ve read, but instead should demonstrate understanding of your selected area of research. It should define the boundaries of your inquiry, identify some of the most significant works, and contextually serve as a starting point for positioning your research. A good literature review serves to justify your research through a critical analysis of the material.

Instructions

  1. Start your literature review by selecting a human development topic that you found interesting while exploring the material from Flowminder and WorldPop. Also select a region or country where you think investigating your selected topic will be inherently relevant due to local circumstances. Draft a title for your forthcoming literature review that includes your selected dimension of human development and the region or country of interest. Begin by writing a few sentences that describe how your selected topic relates to Amartya Sen’s definition of human development.
  2. Identify one or more of the data science methods that you found to be significant while annotating the four sources from Assignment 1. Also identify one or more of the datasets you found to be significant as part of your annotated bibliography. Do your best to name and describe the (geospatial) data science methodology (or methodologies) and identify the source of your data (or datasets).
  3. Research your selected human development topic and identified data science method(s). A good place to start is William & Mary’s online library database. You may want to approach the Swem research desk and ask for guidance on how to identify new, relevant and useful sources. It may also be helpful to schedule an appointment with a research librarian. Be sure to provide the research librarian with your topic, methods of interest, datasets and region of investigation in advance so she or he has some idea of how to assist prior to meeting.
  4. Select at least four new sources that contribute to understanding the current state of your selected human development topic. You are welcome to select a source that describes your selected topic from a more global perspective, but please keep in mind that at this step in the research process we are not interested in proposed solutions to a development problem. Sources that focus on describing, analyzing and modeling your selected human development topic in terms of patterns or processes will likely be more helpful. Field studies that couple methods and data with on-site observations and/or verification of information could also be contextually useful. Identify new sources that are central to your focus and contribute to increasing the depth of your investigation. You are welcome to use new sources from Flowminder or WorldPop so long as each one is in addition to the four from your annotated bibliography. Upon selecting each new source, again annotate the article and add it to your bibliography.
  5. Begin writing your literature review by first drafting an outline that reflects a deconstruction of the annotations from your bibliography, and then a subsequent hierarchical, re-assimilation of associated ideas. Build your outline so it structurally reflects your selected human development theme, sub-themes and (geospatial) data science method(s). You may want to dedicate one section of your outline to describing your dimension of human development from a global perspective and then contextually introduce your selected region or country. You may then want to dedicate a subsequent section of your outline to a relevant sub-theme that also serves to introduce a data science method and its application. Thematically populate your outline with your annotations, associating and synthesizing similar ideas. Critically analyze the material where possible.
  6. Begin to write your literature review by transforming the information contained within the structure of your outline into an integrated, fluid and coherent paper. Be sure to add a section that introduces your literature review. For example, you could introduce your work by providing a general analysis that considers your topic from a comprehensive perspective as well as in terms of your area of regional focus. You could also introduce your research by defining the parameters of your inquiry. As you proceed to the sections that form the main body of your work, seek to provide insight and add value by progressively synthesizing findings from your sources. After completing a first draft, you may need to find one or two additional sources that serve to advance your research focus and the depth of your work in terms of critical analysis. While researching and developing your literature review, consider the following questions.
    1. Are you able to describe and define how your selected human development process behaves as a complex adaptive system? Are you able to identify social, economic or environmental features associated with your human development process?
    2. Are you able to identify features associated with your human development process that are difficult or nearly impossible to predict in detail? Alternatively, are you able to identify broader system properties or features that are more feasible to predict?
    3. Are you able to identify an emergent property inherent to your human development process that does not necessarily link to an individual specific agent? Are you able to describe how your human development process tends towards greater complexity and away from equilibrium?
    4. In your opinion, which scientific questions are most relevant to your selected human development process?
  7. Conclude your literature review with a one- or two-paragraph reflection on your work to date. Follow this summary by attempting to identify a gap in the literature that you think needs to be addressed. Also draft a central research question that your investigation into a human development process will seek to answer.

https://libraries.wm.edu

https://libraries.wm.edu/appointments/

Literature Review Guide: lit_review_guide.pdf

Deliverable

Type your literature review. Title your document, add your name and cite your sources. Identify each reference in a bibliography at the end of your document. Your review should be at least 2000 words in length, not counting the title or reference information; I will exclude both when verifying that your work has met the word count. Note your word count on the first page. Please print your literature review and place it in the box outside my office door no later than 10PM, Saturday, October 19th.
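One possible way to check a word count that leaves out the title and the reference list is sketched below in Python. It rests on assumptions that are not part of the assignment: the document is available as plain text, the first non-empty line is the title, and the bibliography starts at a line consisting only of a heading such as "References" or "Bibliography". The function name and the sample text are hypothetical.

```python
import re


def body_word_count(text, reference_headings=("references", "bibliography", "works cited")):
    """Count words between the title line and the reference heading (both excluded)."""
    lines = text.splitlines()
    # Skip leading blank lines, then treat the first non-empty line as the title.
    while lines and not lines[0].strip():
        lines.pop(0)
    body = lines[1:]
    # Keep lines until one consists solely of a reference heading.
    kept = []
    for line in body:
        if line.strip().lower().rstrip(":") in reference_headings:
            break
        kept.append(line)
    # A word is any run of non-whitespace characters.
    return len(re.findall(r"\S+", "\n".join(kept)))


# Hypothetical miniature document used only to show the behavior.
sample = """My Review Title

This review has exactly seven body words.

References
Sen, A. (1999). Development as Freedom.
"""
print(body_word_count(sample))  # 7
```

A line with your name directly under the title would still be counted by this sketch, so treat the result as an estimate and adjust by hand where needed.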

Rubric

Rubric for Assignment 2

Assignment 3: Write a Methodological Investigation

Summary

Data science is primarily concerned with using data to describe events occurring in our natural or social world, and with statistical methods that estimate relationships, interactions and causes in order to analyze conditions and infer possible future outcomes. Thus far you have identified a harmful, significant, and quantifiably measurable problem associated with a dimension of human development and described the inherent and complex nature of that problem through a critical analysis of the literature. Now you will select at least two geospatial data science methods that describe your problem, model relationships between variables, analyze interactions and produce findings that partially answer your formulated central research question. You will need to describe the data used in each journal article, interpret each model and finally seek to identify a gap in the literature worthy of future research.

Instructions

  1. Begin by writing a comprehensive introduction to your selected human development topic. Identify salient harms, quantify their significance and describe the inherent and complex nature of the geospatial human development process you are investigating. Identify the geospatial data science method you have selected and describe it. State your broad central research question and relate how your selected method has been used to describe, analyze and/or model your selected geospatial human development process.

  2. Select the type of inquiry your investigation into a geospatial human development process will seek to answer. Consider the following categorical types of inquiries into human development (see http://www.meshguides.org/guides/node/468).

    1. An exploratory inquiry seeks to find out what is happening through investigating a social process or phenomenon as a kind of developmental puzzle. Your research focuses on how a process or system has developed and as part of that exploratory inquiry you are likely to generate sub-questions to consider for additional research.
    2. A descriptive inquiry seeks to provide a profile of individuals, events or situations and can be considered as a kind of mechanical puzzle that describes how something works and why it works in this way. Measures of accessibility or level-of-service are evaluative planning tools that could serve to describe a current state.
    3. An explanatory inquiry seeks to understand what is happening in a situation or problem by identifying causes and effects. This explanation may involve knowledge about: why certain events take place; why things happen as they do; how things happen; and what are the processes involved. An explanatory inquiry can be a kind of comparative puzzle where considering the similarities and differences between processes or contexts can be useful to understand systematic relationships.
    4. An evaluative inquiry seeks to grade or make judgement as to the effectiveness of a particular practice. It can be a kind of causal / predictive puzzle that explores the influence of a particular factor on another factor or explores causes underlying an observed phenomenon or process.

    Support how your selected type of inquiry relates to your broader central research question in terms of scope, processes, hierarchy or dynamics. Permutate through the details of your human development process in order to identify, describe, analyze and possibly infer inherent, systematic and native intricacies. Identify at least three sub-research questions that are explicit and direct in their service of answering your broader themed focus, while also being proximate and germane to one another. Also support how your selected type of inquiry relates to your focused sub-research questions, again in terms of scope, processes, hierarchy or dynamics.

  3. Select two geospatial data science methods from your eight sources. Describe the data sources used with each of the applied scientific methods and identify tools or methods the authors used to collect their data. How was the data processed? Did the authors conduct a survey or use secondary data? Did their data have both a spatial and temporal dimension? What is the validity and reliability of the data? Add a table, chart, illustration or map in order to further reinforce your description of the data.

  4. Computationally describe what each of the two data science methods does and why each one is significant towards advancing a better understanding of your selected human development process. Describe how the authors used their data to specify their statistical model. Identify and describe the variables used in each model. Compare and evaluate the two geospatial data science methods as needed. Is one an improvement over the other through addressing a research gap in the literature? Are the two methods complementary in answering more detailed aspects of your broader research focus? Typeset each of the two methods in their mathematical form. Again, add a table, chart, illustration or map in order to describe computational analysis and/or results.

  5. Identify the findings that each article reports as a result of applying their statistical method to the data. Describe each finding and explain how it contributes towards answering part of your central research question. Did the authors note significant correlations between two or more variables during their analysis? Feel free to speculate and discuss as to why any correlations may exist. Include non-textual elements, such as plots, charts, tables and/or maps to illustrate key findings from each of your journal articles as needed. Do any of the models have predictive power to forecast future conditions? Have the authors validated their model? How did they do so?

  6. Identify an area in the research that appears to need further consideration yet seems to have been neglected. Perhaps it is a part of your central research question that was left unanswered. A research gap could also reflect a limitation or need for improvement with a statistical method. Do the data or models fail to describe, analyze or predict some essential element from your area of research? Identify a research gap and discuss it in your conclusion.

Deliverable

Type your methodological investigation. Title your document, add your name and cite your sources. Identify each reference in a bibliography at the end of your document. Your investigation should be at least 1200 words in length, not counting the title or reference information; I will exclude both when verifying that your work has met the word count. Note your word count on the first page. Please print your methodological investigation and place it in the box outside my office door no later than 10PM, Saturday, November 7th.

Rubric

Part 1. Present your Results

Summary

The final assignment of this course has two parts. First, you will individually present the results of your research during our in-class colloquium on human development and data science. The colloquium will be held on Wednesday, November 18th from 9AM until noon.

Presentation

Over the course of this semester you have defined a development-related problem, critically analyzed the literature about that problem, developed your central research question, interpreted at least two data science methods that partially answer that question and suggested a research gap. Now, we will hold a colloquium where you will present your results. Each student will need to write an abstract that represents the sum of the work to be presented. Then each student will need to prepare a 10-minute presentation on their individual work. Finally, you will present your work as part of our 2-day, in-class colloquium.

Step 1. Write an abstract and give your presentation a title

Draft an abstract and give your presentation a title. Review and edit your abstract, which should be less than a page in length, single spaced. Write your abstract, including title, your name and the date you will present. Post your abstract to your index by midnight Sunday, November 15th.

Step 2. Prepare the slides for your presentation

Prepare to speak for 10 minutes and include a presentation slide that addresses each of the following elements from your research.

  • An introduction to your selected human development topic. Identify salient harms, quantify their significance and describe the inherent and complex nature of the geospatial human development process you have been investigating. Identify the type of inquiry you have selected for your investigation and justify your answer. Elucidate your research in terms of scope, processes, hierarchy or dynamics. Permutate through the details of your human development process in order to identify, describe, analyze and possibly infer inherent, systematic and native intricacies.
  • State the two geospatial data science methods you selected. Identify and describe data sources used as well as incorporated variables and methods of sensing or collection. Describe what each method does and how it advances an improved understanding of your selected human development process. Identify the most salient findings resulting from the application of your data science methods. Be sure to use formulas, tables, charts, graphs and maps as needed.
  • Identify an area of the literature that requires further investigation. Discuss how data or models fail to describe, analyze or predict some essential element of your selected geospatial human development process. Align your central research question such that it addresses your defined research gap. Position this newly formed central research question in relation to both the broader and more specific questions you had previously articulated.
  • Propose a research plan that you will implement in order to answer your central research question. Focus on the design of the plan rather than cost. Keep in mind a 1 year time frame from the point of approval and funding to achieving planned goals and objectives.

Post a final draft of your slides to Slack by midnight the day before your presentation.

Step 3. Present your research during the colloquium

Come prepared to present your work at the agreed upon time. Each speaker will have 10 minutes to present their work, which will be followed by a brief question and answer session. Questions will be randomly selected from the audience, so please plan to be present.

Part 2. Write a Research Proposal

This is the second part of the final assignment. Your research proposal should meet a minimum word count of 1800 words, not including references or titles. Be certain to cite all sources and include a bibliography of references. Include figures, captions and formulas as needed. Please post a link to your research plan on your GitHub index no later than 5:00 PM on May 18th.

Research Proposal Steps (1800 words minimum)

Write a research proposal that you plan to implement in order to answer your central research question.

  1. Introduce your research idea and provide context for the starting point. Describe the open question you have identified as well as its relevance to the existing body of literature. Include perspective regarding the potential benefit from successful implementation of your research plan.
  2. Propose your solution or concept that will serve to advance answering your central research question. Describe your approach, including the tasks that need to be executed in order to investigate the research problem. Postulate a novel hypothesis.
  3. Describe the objective of your inquiry. Identify major gains, obstacles or bottlenecks you anticipate. What suggestions for subsequent research could arise from the potential outcomes of the study? What will the results mean to practitioners in the natural settings of their workplace? In what way would individuals or groups potentially benefit? How will the results of the study be implemented and what innovations could emerge? https://libguides.usc.edu/writingguide/researchproposal
  4. Argue why the jury / foundation should definitively consider your research plan. Make your best argument in favor of funding for your proposal. You are welcome to compare your plan to other methods that were identified as part of your literature review or methodological investigation. Focus only on comparing the pros and cons of methods within the context of your similarly seated process oriented application.
  5. Which objections do you expect in response to your plan? Provide arguments against those possible concerns. Also include justification for your selected research design when compared to other options.
  6. Provide a brief budget that enumerates a 1 year exploratory phase that would serve as the basis for implementing your research. Consider all data costs as nominal. Consider any costs associated with time on-site, travel, accommodations or food in favorable terms. Consider that all potential partner institutions (or businesses) are generally receptive towards contributions that will serve to fulfill plan goals and objectives within reasonable limits. Consider that all costs incurred throughout your exploratory phase will need to be accounted for with receipts. Consider that you are bound to all laws of the country where the donor institution is located (the United States) as well as the William & Mary Honor Code.
  7. Keep in mind that your plan should focus on a methodological solution to your proposed central research question. This will likely involve a step forward with some aspect of the existing methodology and/or access to data needed for use in that newly conceptualized method. You may consider a $100,000 budget as the limit for your 1 year exploratory phase.

Student Work

Links to Responses