Course Curriculum

Explores database administration by examining the RDBMS engine. Studies advanced techniques for managing traditional data, including tuning and optimizing performance, maximizing throughput, and designing fault-tolerant systems. Provides hands-on experience with the Oracle DBMS.

Advanced Database Management Systems (ITEC 541)

• Relational Constructs of Data Manipulation
    a. Review of Conceptual Underpinnings of Relational Databases with emphasis on data independence and its impact on query languages
    b. The Relational Algebra
    c. Advanced SQL
    d. Implementation of retrieval language constructs
• Physical Database Implementations
    a. Storage and File Structures
    b. Indexing, hashing, and query tuning
    c. Query Processing with emphasis on query optimization
    d. Enterprise Database Tuning Opportunities
• Advanced Logical Design Issues
    a. Advanced Constraints, Types, and Assertions
    b. Concurrency and Client/Server Systems, Transactions, Transaction Isolation Levels
    c. Temporal Databases and flashback
    d. Missing Information
    e. Object Relational Databases
    f. Large Objects (LOBs)
• Issues in Database Security
    a. User Accounts, Roles, Profiles, and Privileges
    b. Authentication
    c. SQL Injection, Inference, and other common attacks
    d. Data and Password Encryption, Password Policies

 

Students who complete this course will be able to:
• Describe the key attributes of a data retrieval language.  Demonstrate proficiency with the relational algebra or other mathematically based retrieval language.
• Describe and apply basic concepts of file organization including the properties and architecture of physical devices such as disk drives.
• Describe and compare methods for efficient data retrieval of persistent data including indexes, hashing, and sequential access.
• Describe and explain the steps in query processing, and evaluate execution plans.
• Implement operations/algorithms from the relational algebra or other retrieval language.
• Explain the purpose of query optimization, recognize opportunities for optimization, and draw and optimize expression trees.
• Perform tuning tasks on an enterprise-level DBMS.
• Construct appropriate designs for databases that present significant temporal, null value, or other complexities.
• Explain the ACID properties of transaction control. Implement transactions with those properties in stored procedures. Implement triggers for complex constraints.
• Describe and use current extensions of relational database technology such as object-relational or XML extensions.
• Explain theoretical and practical uses and limitations of nested tables, arrays, and user-defined types in relational databases.
• Explain the options for storing large objects (video clips, pictures, documents, etc.) in a database and retrieving them from it, and the advantages and disadvantages of each.
• Implement a database application that uses large objects.
• Describe fundamental challenges associated with database security, and identify and describe solutions to those challenges.
• Analyze and manage typical privilege systems for database systems.
• Employ data encryption techniques on an RDBMS.
• Implement password and/or other authentication policies on an RDBMS.

Studies how organizations monitor and analyze their business. Studies traditional and big data techniques for managing and analyzing large data sets. Studies the ETL process, machine learning algorithms, and data visualization. Provides hands-on experience with the Oracle DBMS, Hadoop, Pig, and Hive.

Data Warehousing, Mining, and Reporting (ITEC 542)

• Introduction to business intelligence
• Data Warehousing
    a. Dimensional modeling
    b. Warehouse aggregates
    c. Data quality
    d. Extract, transform, and load (ETL) process
    e. Physical design
    f. Data warehousing lifecycle
• Reporting and data analysis
    a. Online analytical processing (OLAP)
    b. Commercial query and reporting tools
• Data mining
    a. Data mining methodology
    b. Statistical methods
    c. Decision trees
    d. Association rules
    e. Clustering
    f. Neural networks
    g. Data preparation

 

Students who complete this course will be able to:
• Design and develop a star schema and describe best practices for dimensional modeling.
• Design and develop a basic ETL process and explain the challenges of the ETL process.
• Identify and develop valuable aggregates for a given problem.
• Design and develop different types of reports and reporting requirements.
• Describe the limitations of SQL with respect to analytical reports.
• Describe common data mining tasks.
• Describe data mining techniques and implement at least one technique.
• Explain the value of transactional data with respect to business intelligence.
• Explain the importance of data quality and the challenges of producing high-quality data.

Investigates techniques for managing massive volumes of data and studies the design of scalable systems, on-demand computing, and cloud computing. Provides hands-on experience with Hadoop and NoSQL databases.

Distributed Database Systems (ITEC 641)

    1) Introduction to Distributed Databases
        a. Need for Distributed Databases
        b. Challenges associated with Distributed Databases
        c. Types of Distributed Databases/DD Architectures
    2) Supporting Concepts in Computer Networks
        a. Networking Overview
        b. Network Topologies
        c. The OSI Model
        d. Common Protocols
        e. The Internet and the Domain Name System
    3) Designing Distributed Databases
        a. Vertical and Horizontal Fragmentation
        b. Data Replication and Replication Models
        c. Designs for Semi-structured and voluminous data
    4) Distributed DBMS
        a. Distributed DBMS Architectures
        b. Distributed Transaction Management
        c. Distributed Concurrency Control
        d. Distributed Query Processing
        e. Distributed DBMS Security and Metadata Management
    5) Issues of Scale
        a. Introduction to Scalability
        b. NoSQL databases
        c. Streams

 

At the end of the class, students must be able to:
    1) Describe and apply general principles and concepts of distributed computing and distributed
        computing networks.
    2) Design and implement distributed databases.
    3) Compare and contrast consolidated and distributed query processing and concurrency control.
    4) Design efficient distributed transactions.
    5) Describe distributed database management reliability.
    6) Describe NoSQL solutions to voluminous semi-structured data.
    7) Identify and describe the advantages and challenges associated with data streams.

Examines advanced techniques for tuning and optimizing performance. Studies load balancing, clustering, mainframe systems, and other methods of managing traditional data and big data. 

Database Performance and Scalability (ITEC 643)

    1) Basic Database Tuning
        a. Review of Indexing and Hashing Schemes
        b. Tuning SQL
        c. Tuning Memory and Storage Structures and Parameters
        d. Tuning Network Communication
    2) Load Testing and Load Balancing
        a. Methods for Load Balancing
        b. Methods for Load Testing
    3) Virtualization and Cloud Architectures
        a. Purpose of Virtualization
        b. VMware Details
        c. Cloud Architectures
    4) Big Data
        a. Defined
        b. Data at Rest vs. Streams
        c. Current Tools

 

Students who complete this course will be able to:
    1) Describe and apply techniques for tuning database systems.
    2) Design and apply techniques for load testing and load balancing distributed database systems.
    3) Identify and describe the advantages and challenges associated with virtualization and cloud
        computing for database systems.
    4) Design and assess Big Data architectures and their performance.
    5) Describe and apply current techniques for Big Data Storage.
    6) Describe the advantages and limitations of NoSQL solutions to distributed data.

Studies reliability, security, and privacy issues related to storing, transmitting, and processing large data sets. Studies techniques to secure databases and system infrastructure and methods to assure data integrity through fault tolerance and data recovery.

Information Security and Assurance (ITEC 645)

    1) Fundamentals of information security and privacy
        a. Goals of security (confidentiality, integrity, availability, authentication, non-repudiation and
           accountability)
        b. Vulnerabilities and exploits on DBMS and data sets (e.g., programming flaws, SQL injection,
           statistical inference attacks)
        c. Threat modeling and security analysis
    2) Information Security with data storage and management
        a. Cryptography (symmetric key, asymmetric key, secure hashes and modes of operation)
        b. Secure design principles (e.g., least privilege, complete mediation, separation of privilege,
           least common mechanism, defense in depth)
        c. Authentication
        d. Access control
        e. Access logs
        f. Security mechanisms (e.g., perimeter security, host based security)
        g. Secure operations (backups, hardening distributed databases, disaster recovery, business
           continuity)
    3) Privacy
        a. Statistical inference attacks and controls
        b. Legal issues (e.g., HIPAA, FERPA, ECPA)
    4) Reliability
        a. Failures
        b. Fault tolerance

 

Students who complete this course will be able to:
    1) Enumerate the main goals of security and privacy including confidentiality, integrity,
       availability, authentication, non-repudiation and accountability.
    2) Analyze and develop threat models for the security of database management systems,
       networks and distributed database infrastructures.  
    3) Analyze and develop threat models for the privacy of data (such as inference attacks).
    4) Perform security analysis on centralized and distributed database installations using techniques
       such as the Open Source Security Testing Methodology Manual (OSSTMM).
    5) Describe and apply cryptographic algorithms and mechanisms, including secure hashes, secret
       key and public key cryptography, and their modes of operation, to secure both stored data and data
       in transit across networks.
    6) Describe and apply standard secure design principles, including least privilege, complete
       mediation, least common mechanism, economy of mechanism, defense in depth, reluctance to trust,
       and privacy, to the different database installations.
    7) Describe and deploy authentication, fine-grained access control and accountability mechanisms
       (such as access logs) on database management systems and distributed and centralized database
       installations.
    8) Describe and deploy mechanisms that provide security, such as intrusion detection systems,
       and privacy, such as those that protect against statistical inference attacks on databases.
    9) Perform secure operations including backup, recovery and secure updates.
    10) Administer security by enumerating the steps of risk management and developing security
        policies and plans, such as acceptable usage policies and business continuity and disaster
        recovery plans.
    11) Enumerate and identify privacy issues of data taking into account the federal and state laws
        that govern privacy, such as HIPAA, FERPA, and the Electronic Communications Privacy Act (ECPA).
    12) Describe reliability mechanisms to achieve fault tolerance in distributed databases.

Investigates comprehensive, enterprise-wide approaches to organize, protect, and control trusted information assets. Studies techniques to govern, control, and protect data on-site and off-site including master data management, data quality, data integration, and cloud computing architectures.

Enterprise Information Architecture (ITEC 647)

    1) Information architecture
    2) Information governance
    3) Master data management
    4) Information quality
    5) Data integration
    6) Metadata management

 

Students who complete this course will be able to:
    1) Explain the importance of data governance.
    2) Develop policies for protecting and securing data and information assets.
    3) Design and develop a system to protect and secure data and information.
    4) Develop a program that integrates data from multiple sources.
    5) Profile data elements.
    6) Analyze the quality of an individual data element.
    7) Analyze the overall quality of a data source.
    8) Design and develop a system to capture and manage metadata.

Explores data structures and algorithms for storing and processing traditional data and big data. Provides hands-on experience with Spark and Scala.

Data Structures for DBMS (ITEC 660)

    1) Analysis of algorithms
        a. Time and space
        b. Amortized analysis
        c. I/O bottlenecks
    2) Memory hierarchy
        a. Caching
        b. External memory organization (disk organization)
    3) Sorting and searching algorithms (e.g., counting sort)
    4) External memory and cache-oblivious data structures and algorithms (e.g., types of B-trees)
    5) Hashing
    6) Algorithms that exploit temporal and spatial locality
    7) Succinct data structures (rank, tries, suffix arrays) for storing data compactly
    8) Advanced topics, such as
        a. Data compression
        b. Pattern matching
        c. Search engine indexing
        d. NP-completeness

 

Students who complete this course will be able to:
    1) Compare and contrast temporal and spatial efficiency of algorithms and data structures used
        to store, query and process medium to large data sets.
    2) Describe and analyze the performance issues of the different memory organizations used to
        store large data sets.
    3) Describe and apply data structures and algorithms that achieve efficiencies in query and
        processing times of medium to large data sets.
    4) Describe and apply data structures and algorithms that store data compactly.
    5) Describe current algorithms and data structures used to store, query and analyze medium to
        large data sets.

Studies techniques for analyzing structured, unstructured, and semi-structured data at rest and in motion. Studies non-traditional data sources including social media, mobile devices, and sensors; emerging analytical applications; real-time processing of data streams; and massively parallel processing technology. 

Information Analytics (ITEC 685)

    1) Databases and their evolution
    2) Big data technology, NoSQL
    3) AI techniques
    4) Logic rules, uncertainty
    5) Bayes' rule, naïve Bayes, Bayesian networks
    6) Sentiment analysis
    7) Association rule mining
    8) Learning latent models, machine learning
    9) Clustering, classification
    10) Linear and logistic regression
    11) Least squares, optimization
    12) Nonlinear models, neural networks
    13) Dimensionality reduction
    14) Anomaly detection
    15) Recommender systems
    16) Parallel computing, MapReduce
    17) Analytics tools

 

Students who complete this course will be able to:
    1) Categorize data into groups based on attributes.
    2) Classify information based on existing data.
    3) Identify the relationship between elements of a decision.
    4) Understand optimization, maximizing certain outcomes while minimizing others.
    5) Develop decision logic or rules that will produce the desired action.
    6) Effectively predict a future event based on a given model.
    7) Seek out subtle data patterns to answer questions about customer performance, for example
        with fraud detection models.
    8) Understand requirements in using big data analytics.
    9) Simulate human behavior or reaction to given stimuli or scenarios.

Provides students in the Data and Information Management program an opportunity to conduct research or a project in the field of data and information management under the direction of ITEC faculty members. Results of the applied project will be formally presented at the end of the final semester.

Practicum in Data and Information Management: Capstone Project (ITEC 695)

Each section of ITEC 695 will have one instructor. Each student will have a project advisor. Each student will design, implement, and test a substantial component of an information system. Multiple students may work together to develop an information system involving multiple components.