An introduction to computational tools for research: Linux command line, HPC, Bash scripting, Python, SQL, and Artificial Intelligence

BSC4452 and BSC6451 Registration information for Fall 2024

  • BSC 4452: Class 19032


  • BSC 6451: Class 22340

Course Description

The early 2000s were characterized as the era of Big Data, with researchers across disciplines finding that large volumes of diverse data transformed research. In the past decade, research has again been transformed by the re-emergence of artificial intelligence and machine learning systems that aid in interpreting Big Data. As data types and volumes continue to grow, knowledge of scripting, database management, and advanced computing, including AI fundamentals, is critical for researchers across disciplines.

This course introduces students to the tools needed to be proficient, computationally enabled researchers, providing a foundation in Linux and Bash scripting for data management, Python programming, basic SQL database fundamentals, and an introduction to artificial intelligence methods.

The course assumes no prior coding or command-line skills and covers concepts that enable students to apply new technologies to a wide array of research questions. A foundation in data management and analysis concepts opens doors for well-trained researchers and allows them to work in multidisciplinary fields.

Course Organization

The course is divided into four main sections:

  • Section 1: Linux command line and Bash scripting; version control using Git and GitHub; using high-performance computing resources
  • Section 2: Python scripting
  • Section 3: SQL database introduction and integration with Python
  • Section 4: A brief introduction to Artificial Intelligence

Course Objectives

  • Demonstrate how technology infrastructure can improve research and open new avenues of investigation.
  • Competently navigate the Unix/Linux command line interface.
  • Effectively and efficiently manipulate text files, performing complex regular expression replacements, reformatting and merging files in various ways.
  • Raise and address current issues through class participation and discussion.
  • Use AI-assisted tools to generate and debug code.
  • Use high-performance computing resources, such as UFIT Research Computing, for cluster-based analyses, including batch scripting and running multi-processor applications (threaded and MPI).
  • Explain the basic anatomy of computer scripts/programs, with particular focus on Python scripting.
  • Construct analytical pipelines to accomplish complex tasks.
  • Describe basic database design, creation and manipulation. Perform scripted database operations for information discovery, data exploration and research data curation.
  • Have a basic understanding of research graphics formats, preparation, and manipulation.
  • Have a basic understanding of artificial intelligence and gain hands-on experience with computer vision.

Course Textbooks


The main texts for the course are The Linux Command Line (TLCL) and Python for Everybody (Py4E).


Each of these is available as a free PDF download or for purchase in print. Because there are no textbook costs for this course, it has been recognized by Affordable UF as an affordable course.

Course Syllabus

The full course syllabus is available on the UF SimpleSyllabus site.

Course Calendar

Some readings link to pages with my notes and additional explanations of the content from the texts. The texts are abbreviated as TLCL = The Linux Command Line; Py4E = Python for Everybody.

Week Date Reading/Assignment Topic
1 Fri, Aug 23 Download Software Introduction and course objectives
       
2 Mon, Aug 26 Take UFRC New User Training
GitHub Account assignment due tomorrow
Quiz 1 available, due Wednesday, September 11
Getting started: Computers
Getting a GitHub.com account
2 Wed, Aug 28 Read TLCL Introduction & Ch 1-4 Building shell skills
2 Fri, Aug 30
Read TLCL Ch 5-8
Problem Set 1 available, due Monday, September 16

Here are some exercises to work on
       
3 Mon, Sep 02   Labor Day, no class
3 Wed, Sep 04   Continue working on the exercises
3 Fri, Sep 06 Read Notes on Regular Expressions and TLCL Ch 19
Quiz 2 available, due Monday, September 16
Text manipulation
Regular Expressions Handout
       
4 Mon, Sep 09 Read TLCL Ch 20
Optional: Watch Learn the Linux Command Line
Shell scripts and version control with git and GitHub; GitHub branching exercise
4 Wed, Sep 11
Read TLCL Ch 24-26
Quiz 1 due
Problem Set 2 available, due Friday, September 27
Git and GitHub.com
4 Fri, Sep 13 Watch the HiPerGator: SLURM Submission Scripts training
Slides on git and GitHub.com
Shell scripts and version control with git and GitHub; GitHub branching exercise
       
5 Mon, Sep 16 Quiz 2 due
Problem Set 1 due
Read TLCL Ch 23
Compiling source code
Google and Documentation
Using UFIT Research Computing resources
Running batch jobs
5 Wed, Sep 18 Read Py4E Ch 1 Introduction to Python
5 Fri, Sep 20 Problem Set 2 available, due Friday, September 27
Quiz 3 available, due Friday, September 27
Read Py4E Ch 2
Python data types
       
6 Skipping week 6 😰    
       
7 Mon, Sep 30 Finish up Py4E Ch 2
Read Py4E Ch 3
Python: Flow Control
7 Wed, Oct 02 Read Py4E Ch 4 Python: Functions
7 Fri, Oct 04 Problem Set 3 available, due Friday, October 11
Read Py4E Ch 5
Python: Iteration
       
8 Mon, Oct 07 Python data types
Read Py4E Ch 6 & Ch 7
Python: try/except, Strings, File I/O
8 Wed, Oct 09 Read Py4E Ch 11 RegEx in Python
8 Fri, Oct 11 Problem Set 3 due
Read Py4E Ch 12 & Ch 13
Scripting data acquisition
       
9 Mon, Oct 14 Problem Set 4 available, due Friday, October 25
SciPy, NumPy, Pandas
SciPy, NumPy, Pandas
9 Wed, Oct 16 Project 1 available, due Wednesday, October 30 Matt at conference, no class: work on Problem Set 4
9 Fri, Oct 18   Homecoming, no class
       
10 Mon, Oct 21 Quiz 4 available, due Monday, October 28
Scan Py4E Ch 16
Py4E Ch 14: Object oriented Programming
Py4E Ch 14: Object oriented programming
10 Wed, Oct 23   Matplotlib and data visualization
Pandas with Messy Data
Data visualization with Pandas
10 Fri, Oct 25 Problem Set 4 due
Programming Foundations Databases
Work on Project 1
       
11 Mon, Oct 28 Quiz 4 due Database intro
Flight DB Example
11 Wed, Oct 30 Project 1 due
Overview of databases
Database design
11 Fri, Nov 01 Read Py4E Ch. 15, through 15.5 and my notes
Problem Set 5 available, due Friday, November 15
Py4E Ch. 15, through 15.5 and my notes
Databases, SQL and sqlite
       
12 Mon, Nov 04   Py4E Ch. 15, through 15.5 and my notes
Databases, SQL and sqlite
12 Wed, Nov 06 Quiz 5 available, due Wednesday, November 13 More on databases and Joins
12 Fri, Nov 08 Project 2 available, due Wednesday, December 04 SQLAlchemy
       
13 Mon, Nov 11   Veterans Day, no class
13 Wed, Nov 13   SQLAlchemy and Pandas
13 Fri, Nov 15 Problem Set 5 due Argparse
       
14 Mon, Nov 18   Graphics
14 Wed, Nov 20   Work on Project 2
14 Fri, Nov 22 Quiz 5 due
Quiz 6 available, due Wednesday, December 04
Intro to AI
       
15 Mon, Nov 25   Thanksgiving, no class
15 Wed, Nov 27   Thanksgiving, no class
15 Fri, Nov 29   Thanksgiving, no class
       
16 Mon, Dec 02   Intro to AI
16 Wed, Dec 04 Project 2 due
Quiz 6 due
Intro to AI