Apache Cassandra

Apache Cassandra is a free, open-source, second-generation distributed NoSQL database. It is widely regarded as a strong choice where high availability and scalability are required, particularly with large amounts of data. Cassandra supports replication across multiple datacenters and offers tunable consistency, which lets reads and writes scale with the cluster. This Apache Cassandra training course gives you an overview of the fundamentals of Big Data and NoSQL databases, an understanding of Cassandra's features, architecture and data model, and its role in the Hadoop Big Data ecosystem. It also shows you how to install, configure and monitor Cassandra.
The large volume and variety of data that today's businesses process call for a highly available, low-latency database. Apache Cassandra meets this need with high-speed reads and writes across a replicated, distributed system. This Apache Cassandra training course provides the data modeling experience needed to take advantage of Cassandra's linearly scalable peer-to-peer design.
Delegates will learn how to:
· Architect Cassandra databases and implement commonly used design patterns
· Model data in Cassandra based on query patterns
· Access Cassandra databases using CQL and Java
· Create a balance between read/write speed and data consistency
· Integrate Cassandra with Hadoop, Pig, and Hive
Audience
· Professionals aspiring to a career in NoSQL databases and Cassandra
· Analytics professionals
· Research professionals
· IT developers
· Testers
· Project managers
Prerequisites
· Knowledge of databases and SQL
· Java programming
NoSQL Overview
Justifying non-relational data stores
Listing the categories of NoSQL Data Stores
Exploring Cassandra
Defining column family data stores
Surveying Cassandra
Dissecting the basic Cassandra architecture
Querying Cassandra
Defining Cassandra Query Language, CQL
Enumerating CQL data types
Manipulating data from the cqlsh interface
Leveraging Cassandra structures and types
Drawing comparisons with the relational model
Organizing data with keyspaces, tables and columns
Creating collections and counters
Modeling data based on queries
Designing tables around access patterns
Clustering with compound primary keys
Improving data distribution with composite partition keys
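As a rough illustration of the modeling topics above, the following is a toy in-memory sketch, not Cassandra itself. It assumes a hypothetical sensor-readings table and shows the general idea: rows are grouped by partition key and kept sorted by clustering column, and widening the partition key from (sensor_id) to (sensor_id, day) spreads one sensor's data over more partitions. The class and variable names are invented for this example.

```python
from bisect import insort
from collections import defaultdict


class ToyTable:
    """Toy model of a table: partition key -> rows sorted by clustering key."""

    def __init__(self):
        self.partitions = defaultdict(list)

    def insert(self, partition_key, clustering_key, value):
        # insort keeps each partition ordered by clustering key.
        insort(self.partitions[partition_key], (clustering_key, value))

    def read_partition(self, partition_key):
        # A single-partition read returns rows in clustering order.
        return list(self.partitions.get(partition_key, []))


# Compound primary key: sensor_id is the partition key,
# (day, hour) are the clustering columns.
by_sensor = ToyTable()

# Composite partition key: (sensor_id, day) is the partition key,
# hour is the clustering column.
by_sensor_day = ToyTable()

# Hypothetical sample data: (sensor_id, day, hour, temperature).
readings = [
    ("s1", "2024-01-01", 9, 20.5),
    ("s1", "2024-01-01", 8, 19.0),
    ("s1", "2024-01-02", 8, 18.2),
]

for sensor_id, day, hour, temp in readings:
    by_sensor.insert(sensor_id, (day, hour), temp)
    by_sensor_day.insert((sensor_id, day), hour, temp)

print(len(by_sensor.partitions), len(by_sensor_day.partitions))
print(by_sensor_day.read_partition(("s1", "2024-01-01")))
```

In this sketch the first layout puts every reading for a sensor into one ever-growing partition, while the second bounds each partition to one day of data at the cost of needing the day to locate a row.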
Detailing tunable consistency
Identifying consistency levels
Selecting appropriate read and write consistency levels
Distinguishing consistency repair features
Balancing consistency and performance
Relating replication factor and consistency
Trading consistency for availability
Achieving linearizable consistency with Compare-And-Set
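The relationship between replication factor and consistency level can be sketched with a little arithmetic. The example below is a minimal sketch assuming a single datacenter and only the ONE, TWO, THREE, QUORUM and ALL levels; the function names are hypothetical. It uses the usual rule of thumb that a read is guaranteed to overlap the latest acknowledged write when read replicas plus write replicas exceed the replication factor.

```python
def quorum(replication_factor: int) -> int:
    """Majority of replicas: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1


def replicas_required(level: str, replication_factor: int) -> int:
    """Replicas that must respond for a given consistency level."""
    levels = {
        "ONE": 1,
        "TWO": 2,
        "THREE": 3,
        "QUORUM": quorum(replication_factor),
        "ALL": replication_factor,
    }
    required = levels[level]
    if required > replication_factor:
        raise ValueError(f"{level} needs more replicas than RF={replication_factor}")
    return required


def reads_see_latest_write(read_level: str, write_level: str,
                           replication_factor: int) -> bool:
    """True when the read and write replica sets must overlap (R + W > RF)."""
    r = replicas_required(read_level, replication_factor)
    w = replicas_required(write_level, replication_factor)
    return r + w > replication_factor


for read_level, write_level in [("ONE", "ONE"), ("QUORUM", "QUORUM"), ("ONE", "ALL")]:
    print(read_level, write_level, reads_see_latest_write(read_level, write_level, 3))
```

With a replication factor of 3, QUORUM reads and writes each touch two replicas, so the sets overlap and one replica can be down without blocking either operation. ONE/ONE is faster and more available but gives no overlap guarantee.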
Working with Cassandra collection types
Grouping elements in sets
Ordering elements in lists
Expressing relationships with maps
Nesting collections
Storing data for easy retrieval
Mapping data to tuples and user-defined types
Investigating the frozen keyword
Applying the Valueless Columns Pattern
Implementing clustering columns strategically
Controlling data life span
Expiring temporal data with time-to-live
Reviewing how tombstones achieve distributed deletes
Executing DELETEs and UPDATEs in the future
Constructing materialized views and time series
Modeling time series data
Enhancing queries with materialized views
Maintaining materialized views in the application
Driving analytics from materialized views
Managing triggers
Creating triggers by implementing ITrigger
Attaching triggers to tables
Supporting materialized views with triggers
Querying Cassandra data with the DataStax Java Driver
Connecting to a Cassandra cluster
Running CQL through the Java Driver
Batching prepared statements
Paginating large queries
Persisting Java Objects with Kundera
Defining the Java Persistence API, JPA
Configuring Kundera to work with Cassandra
Generating schemas automatically
Managing JPA transactions in Kundera
Leveraging built-in Cassandra connectors
Loading data into Hadoop MapReduce with the Cassandra InputFormat
Utilizing the Cassandra Loader to create Pig relations
Converting a Cassandra table to a Hive table with the Cassandra serializer/deserializer (SerDe)
Program Details
Duration | 3 Days
Capacity | Max 12 Persons
Training Type | Classroom / Virtual Classroom