Saturday, 8 April 2017

Unit 3: Relational Algebra



Relational Algebra



Relational Query Languages

Languages for describing queries on a relational database
Structured Query Language (SQL)
  • Predominant application-level query language
  • Declarative
Relational Algebra
  • Intermediate language used within DBMS
  • Procedural



What is an Algebra?

  • A language based on operators and a domain of values
  • Operators map values taken from the domain into other domain values
  • Hence, an expression involving operators and arguments produces a value in the domain
  • When the domain is a set of all relations (and the operators are as described later), we get the relational algebra
  • We refer to the expression as a query and the value produced as the query result



Relational Algebra

  • Domain: set of relations
  • Based on set theory
  • Contains extensions to manipulate tables
  • Functional language
  • Procedural, i.e., order to operations, algorithm implicit in the functional evaluation





Relational Algebra Operations

Below are fundamental operations that are "complete".  That is, this set of operations alone can define any retrieval.
  • Select
  • Project
  • Rename
  • Union
  • Set Difference
  • Cartesian Product
Convenient, natural additions to the set of operations makes
  • Set Intersection
  • Natural Join
  • Division
  • Assignment


Select Operation

Choose a subset of tuples from a relation based on some criteria, results in another relation called a "result set"
Notation uses lower case sigma:
σcondition(relation)
condition is a boolean expression in which rows are selected/kept/included where the condition is true
  • attribute <comparisonconstant
  • attribute1 <comparisonattribute2
  • <comparison> is the usual <, >, <=, >=, =, and <> (not equal)
  • andor, and not logical operations
  • substring operations and pattern matching




Examples

  • Get the Democrat Presidents
    σParty="Democrat"(PRESIDENT)
  • Get the living Democrat Presidents
    σParty="Democrat" Λ DeathDate is NULL(PRESIDENT)
IdNameAddressHobby
1123John123 Mainstamps
1123John123 Maincoins
5556Mary5 Lake Dr.hiking
9876Bart10Pine St.stamps
  • Get those persons whose hobby is stamps
  • σId>3000 OR Hobby=‘hiking’ (Person)
  • σ Id>3000 AND Id <3999 (Person)
  • σ NOT(Hobby=‘stamps’)(Person)
  • σHobby≠'hiking’ (Person)


Project Operation

  • Produce a subset of attributes from a relation
  • Unselected columns are eliminated
  • Duplicate rows are eliminated
  • Result is a relation
Notation is lower case pi
πattribute- list(relation)
Project examples
  • "List the political parties that have had Presidents"
  • "List the states that have had Presidents"
  • πName,Hobby(Person)
  • πName,Address(Person)


Expressions

Compose these operations into a sequence to represent a single query.
IdNameAddressHobby
1123John123 Mainstamps
1123John123 Maincoins
5556Mary5 Lake Dr.hiking
9876Bart10Pine St.stamps
πId,Name(σHobby= ‘stamps' OR Hobby='coins’(Person))
Result
IdName
1123John
9876Bart
"List the states that have living Presidents"






Set Operations

Union: r ∪ s ⇒ a row is in the result set if it is a row from r or from s; or the set of all tuples that belong to either r or s.
[Intersection: r ∩ s ⇒ a row is in the result set if it is a row from both r and from s.]
Set Difference: r - s ⇒ the set of all tuples in r that are not in s
Both operations require that the sets r and s are union compatible:
  • Both r and s have same number of columns
  • Domains of attributes match or are compatible in sequence in both r and s
  • The names may not necessarily match up; helpful if they do, however.
Union compatible relations can be combined using union, intersection ∩, and set difference.

Example:

The following tables are NOT union compatible:
  • Person(SSN, Name, Address, Hobby)
  • Professor (Id, Name, Office, Phone)
However,  πName (Person) and πName (Professor) ARE union compatible so that
πName (Person) πName (Professor) makes sense.



Cartesian Product

If R and S are two relations, R × S is the set of all concatenated tuples <x,y>, where x is a tuple in R and y is a tuple in S
  • (R and S need not be union compatible)
R × S is expensive to compute:
  • The sum of the number of attributes of R and S is the number of attributes of the Cartesian product
  • The product of the sizes of R and S in the number of rows
Cartesian Product
R S X S
abcdabcd
xl
x2
y1
y2
x1
x2
y1
y2
x3
x4
y3
y4
x1
x2
y3
y4
x3
x4
y1
y2
x3
x4
y3
y4


Renaming

Result of an expression evaluation is a relation, called the result set.
Attributes of relation must have distinct names. This is not guaranteed with Cartesian product. Suppose in the previous example attributes a and c have the same name.
Renaming operator tidies this up. To assign the names A1, A2,… An to the attributes of the n column relation produced by
expression expr, use the form expr [A1, A2, … An]

Example

  • Transcript ( StudId, CrsCode, Semester, Grade)
  • Teaching (ProfId, CrsCode, Semester)
πStudId, CrsCode (Transcript) [StudId, CrsCode1] ×  πProfId, CrsCode (Teaching) [ProfId, CrsCode2]




Join

This is a derived operation, i.e., it is based on the basic operations of the relational algebra.  It is a convenience operation because it is done so much.
A (general or theta θ) join of R and S is the expression
 join-condition S
where join-condition is a conjunction of terms Ai oper Bi in which Ai is an attribute of R and Bi is an attribute of S and oper is one of =, <, >, ≤, ≥, ≠
The meaning is essentially:
σjoin.-condition(R X S)
where join-condition in both case are the same, except for possible renamings of attributes.

Join and Renaming

Problem: R and S might have attributes with the same name – in which case the Cartesian product is not defined
Solution:
  • Rename attributes prior to forming the product and use new names in join-condition´.
  • Common attribute names are qualified with relation names in the result of the join


Theta Join – example

Output the names of all employees that earn more than their managers
πEmployee Name(Employee|X|MgrId=Id AND Salary>Salary Manager)
the join yields a table with attributes: Employee.Name, Employee.Id, Employee.Salary, Employee.MngrId, Manager.Name, Manager.Id, Manager.Salary

Equijoin Join - Example

Equijoin: Join condition is a conjunction of equalities
πName, CrsCode (Student|X|Id=StudId AND Grade = `A' Transcript )
Equijoin
Student
 
Transcript
IdNameAddressStatusStudIdCrsCodeSemGrade
111John  111CS305F03B
222Mary  222CS306F03A
333Bill  333CS304S04A
444Joe      
Result
MaryCS306
BillCS304



Natural Join

Special, but most useful, case of equijoin:
  • the join condition equates all but only those attributes with the same name
  • the condition doesn’t have to be explicitly stated!
  • duplicate columns are eliminated from the result

Transcript (StudId, CrsCodeSem, Grade )
Teaching (ProfId, CrsCodeSem)
Transcript|X|Teaching =
StudId, Transcript.CrsCode Transcript.Sem, Grade, ProfId (Transcript|X|CrsCode=CrsCode AND Sem = Sem Teaching ))
[StudId,CrsCode, Sem, Grade, ProfId]
More generally:
R|X|S = πattr-list (σjoin.-condition R|X|S )
where attr-list = attributes (R) ∪ attributes(S)
(duplicates are eliminated )and join-condition has the form: A1=A1 and ...and An=An where {A1 ... An} = attributes (R) ∩ attributes(S)


Natural Join Example

List all ID's of students who took at least two different courses:
πStudId (σCrsCode≠CrsCode2 (Transcript|X|Transcript [StudId, CrsCode2, Sem2, Grade2]) )



Division

Goal: Produce the tuples in one relation, r, that match all tuples in another relation, s
  • r (A1, …An, B1, …Bm)
  • s (B1 …Bm)
  • r/s, with attributes A1, …An, is the set of all tuples <a> such that for every tuple <b> in s, <a,b> is in r
Can be expressed in terms of projection, set difference, and cross-product





Division - Example

List the Ids of students who have passed all courses that were taught in spring 2000
Numerator:
  • StudId and CrsCode for every course passed by every student:
  • π   StudId, CrsCode (  σGrade< > ‘F’ (Transcript) )
Denominator:
  • CrsCode of all courses taught in spring 2000
  • π CrsCode ( σ Semester=‘S2000’ (Teaching) )
Result is numerator/denominator



No comments:

Post a Comment

GATE 2018 NOTES

                    COMPUTER NETWORK : PROF.SUBODH R.NIKHALE ...

Amazing idea