SEIS 744 - IoT with Machine Learning

This course covers the technical concepts of managing vast amounts of unstructured, semi-structured, and structured data, collectively called "Big Data". Because of the sheer volume of Big Data, traditional approaches to database management do not work well for it and do not perform as expected. A distributed architecture for both the file system and the operating system is needed. Some of the techniques used in managing Big Data have their origins in decades of research and development in parallel processing and distributed database management systems. This course focuses on why big data sets must be distributed and on the issues that distribution introduces. The basic concepts underlying the handling of distributed data sets are discussed first. Once this foundation is established, the software tools used to work with big data sets are studied to provide an in-depth analysis of the concepts introduced. Specifically, we will study the issues of distributed data design, data fragmentation, data replication, and distributed fault tolerance and recovery. We will use various tools for working with big data sets and examine real-life examples of how these open-source tools are used.
Fee Assessment CSIS, Long form IDEA evaluation, Software Technical Elective
