Introduction

As archaeologists deal with an increasing number of datasets (both new and reused) and want to analyse larger quantities of data, data science can provide some of the necessary skills and tools. There are three main aspects: programming languages (Python and R being the most commonly used), machine learning and statistics. Since data science covers a wide range of data-related activities, the resources in this section aim to give researchers the basics, along with pointers that will enable further development of these skills. The list starts with training websites and directories containing multiple courses that can be perused, followed by individual recommended starter-level resources.

Title | Description | Source (URL) | Type | Level
DataCamp | DataCamp provides online learning courses and is available as a website and mobile app. It covers the most common programming skills such as Python, SQL and R, as well as scripting, spreadsheets and other technologies. Users can search for courses by topic and there are also case studies available. | DataCamp website & mobile app (Apple & Android) | Online courses. First chapters free, otherwise monthly subscription required. | All levels
Towards Data Science | Website resource on all aspects of data science with articles on specific topics, described as an ecosystem for end users. | Towards Data Science | Website with many contributions from data scientists on individual topics such as visualisation. | All levels
The Programming Historian | Excellent website with several courses on commonly used programming languages, techniques and tools for analysing humanities data. | The Programming Historian | Open Access online tutorials on website. | All levels
SSHOC Training Toolkit | Various (mainly third-party) courses and training sources for social scientists and digital humanists, which include some programming courses. | SSHOC | Website directory of training resources. | All levels
Programming
Python | Python is an easy-to-learn, powerful programming language favoured by data scientists, and it is easily installed. The documentation enables everyone to learn and use the basics through to more complex aspects, all for free. | Python.org | Official documentation website with a tutorial and examples. | Basic – Intermediate
Introduction to Python programming | Free course from Udemy in bite-size chunks, given by Avinash Jain, who makes each step as easy as possible using the PyCharm tool. | Udemy | Presentation – audio with screenshots. | Basic
The Ultimate R Guide For Data Science | Excellent introduction to R with recommended resources by Oleksii Kharkovyna, who provides a step-by-step guide to the background, installing the necessary software and some courses for learning the basic syntax. | Towards Data Science | Article with several links to other resources. | Basic
R for Data Science | The goal of this book by Garrett Grolemund and Hadley Wickham “is to help you learn the most important tools in R that will allow you to do data science”, i.e. how to get your data into R, get it into the most useful structure, transform it, visualise it and model it. | R for Data Science | Book – free online (print version O’Reilly). | Basic
Introduction to Data Science. Data Analysis and Prediction Algorithms with R | Another good introduction to using R covering programming, visualisation and statistics, which started out as the HarvardX Data Science course notes. Different aspects are explored through case studies; data wrangling, machine learning and useful tools are also covered. | GitHub | Book – free online and also in other formats (see Source). | Basic
Statistics
The Elements of Statistical Learning | This book by Trevor Hastie, Robert Tibshirani and Jerome Friedman (Springer) covers the statistical methods that are used for activities such as data mining and that help researchers to interpret their results. | Stanford University: PDF | Free PDF version of book. | Basic – Intermediate
Statistical Learning | Introduces some of the main tools used in statistical modelling and data science, covering both traditional and newer methods, and how to use them in R. | edX | Website | Basic
Machine Learning
Data Science 101 – Machine Learning Tutorials | Beginner's guide for anyone who wants to study data science and make their own machine learning models. | App | Mobile app for use with a computer. Android only. | Basic – Intermediate
Machine Learning Levels that a 5yr old can understand | Article providing a 101-level overview of machine learning models with diagrams. | TNW Website | Website article | Basic
Useful Tools
18 Essential Software Every Data Scientist Should Know About | This article summarises a collection of data science tools that cover SQL and similar database applications, visualisation, data scraping, programming languages and Integrated Development Environments (IDEs). | Geekflare | Article with several links to other resources. | Basic – Intermediate
The 17 Best Free Tools for Data Science | An article more focussed on programming, covering languages (R, Python and SQL), software packages and libraries, plus some tools and free learning resources. | Data Quest | Article with several links to other resources. | Basic
Top Tools for Data Scientists: Analytics Tools, Data Visualization Tools, Database Tools, and More | Comprehensive overview of 50 tools and packages, most of them free (plus some paid-for). | NG Data | Article with several links to other resources. | Basic
Orange | Orange is a tool that makes data science fun and interactive. It allows users to analyse and visualise data without the need to code, and also offers machine learning options for beginners. | Orange | Downloadable tool for Windows, macOS and Linux. | Basic – Intermediate
Jupyter Notebook / JupyterLab | The Jupyter Notebook is an open-source web application that allows you to create and share documents that contain live code, equations, visualisations and narrative text. Uses include data cleaning and transformation, numerical simulation, statistical modelling, data visualisation, machine learning and much more. JupyterLab is the newer web-based interface for working with notebooks, code and data. | Jupyter | Website where users can try out the tool and also download and install it. Tutorials also provided. | Basic – Intermediate