Introduction to SQL and the basic SELECT statement (Part 2)

Original Post Date: 01 Feb 2019

Preface

So, I’m taking a course from Stanford’s online Lagunita titled Databases, the self-paced series. the Databases course is comprised of 14 “mini-courses.”

Personally, I’m following their Relational Databases and SQL: Foundations and Practice pathway, which includes:

  • Introduction and Relational Databases
  • Relational Algebra
  • SQL (the topic of this post)
  • Relational Design Theory
  • Unified Modeling Language (UML)
  • Indexes and Transactions
  • Constraints and Triggers
  • Views and Authorization
  • On-line Analytical Processing
  • Recursion in SQL

I’m going to give a bunch of bullet points and notes about what I’ve learned. Here we go!


SQL: Intro

  • “Sequel” is the “in” pronunciation
  • Supported by all major commercial database systems
  • Standardized and new features added all the time
  • Interactive via GUI or prompt, or embedded in programs
  • Declarative, based on relational algebra
    • Declarative means you write simple queries that say exactly what you want to get out of the database; you don’t have to describe how to get the data out
    • Leads to the “query optimizer,” which figures out the fastest way to execute the query on the database

Two parts to the language:

  • Data Definition Language (DDL)
    • create table... drop table... etc.
  • Data Manipulation Language (DML)
    • select insert delete update etc.
  • Other
    • indexes, constraints, views, triggers, transactions, authorization, etc.

The Basic SELECT Statement

Select A1,A2,...,An
From R1,R2,...,Rm
Where condition
  • Select: What to return (3)
  • From: Relations (1)
  • Where: Combine filter (2)

This course uses three relations for its examples:

  1. College(cName,state,enrollment)
  2. Student(sID,sName,GPA,sizeHS)
  3. Apply(sID,cName,major,decision)

Examples

Basic

select sID, sName, GPA
from Student
where GPA > 3.6;
  • Gives a table of all students with a GPA greater than 3.6
  • Don’t need GPA in the select clause even if we’re filtering on the GPA in the where clause

Natural join, select distinct

select sName, major
from Student, Apply
where Student.sID = Apply.sID;
  • Same as natural join
  • In SQL, we get duplicates (mult-set model)…
  • Replacing select with select distinct removes duplicates!

Filtering in where

select sName, GPA, decision
from Student, Apply
where Student.sID = Apply.sID
    and sizeHS < 1000 and major = 'CS' and cName = 'Stanford';
  • Filters results based on sizeHS, major, and college applied to

Avoid ambiguous column names

select distinct College.cName
from College, Apply
where College.cName = Apply.cName
    and enrollment > 20000 and major = 'CS';
  • Must specify cName in select is College.cName

order by

select Student.sID, sName, GPA, Apply.cName, enrollment
from Student, College, Apply
where Apply.sID = Student.sID and Apply.cName = College.cName;
order by GPA desc, enrollment;
  • SQL is an unordered model
  • To sort the results, use order by (desc). asc is the default.
  • Can order by multiple attributes

like and select *

select sID, major
from Apply
where major like '%bio%';
  • like utilizes %, _ for pattern matching
  • select * (select star) returns all attributes

Arithmetic and as

select sID, sName, GPA, sizeHS, GPA*(sizeHS/1000.0) as scaledGPA
from Student;
  • You can do arithmetic within SQL clauses
  • as is used to change the labeling of the schema in a query result