Cloud Development
EXAM2 - Review
Docker - 22 Questions (11 T/F, 11 MC)
Google Cloud Platform (GCP) - 16 Questions (6 T/F, 10 MC)
Amazon Web Services (AWS) - 12 Questions (7 T/F, 5 MC)
Docker
- 22 Questions (11 T/F, 11 MC)
History Timeline
Virtual machines came before containers
Docker has a server daemon that runs on the host
Docker images are built from layers, and you run an image as a container
Difference between Docker and virtual machines: each virtual machine carries a full guest OS, so it is larger
Each container has its own namespace: process tree, filesystem, and network
Docker can be hosted natively on only two operating systems: Windows Server 2016 and Linux
Eliminating app conflicts: Docker allows different versions of frameworks to run side by side
Docker gives namespace isolation
A Dockerfile is not itself an image; it holds the instructions from which an image is built
Images are read-only; to change one, modify a running container and save the result as a new image
Images are made up of multiple layers that can be shared
Terminology question: the host OS is what runs on the bare metal or VM, and the base OS is what is inside the Docker image. On Linux, a base OS is not required.
Interactive mode: the container did not just run and stop; it stayed open so that we could enter commands
What is a container? A running image; it can be started, moved, stopped, and deleted
We launched an image, installed MySQL, and saved the result as a new image
- 1990s: One Application on One Physical Server
  - WebSphere, WebLogic
  - Slow deployment time, wasted resources
- 2000: Virtual Machines (abstraction of hardware)
  - Better resource utilization
  - Each VM requires CPU, storage, RAM allocations, full guest OS
- 2013: Containers (abstraction of operating system)
  - Standardized packaging for software and dependencies
  - Share the same OS kernel
What is Docker?
- Founded in 2013 as a Linux developer tool
- Announced on AWS in Nov 2014
- Is the leading software container platform
- Solves the "works on my machine" problem
- Transforms app and infrastructure security, portability, agility, and efficiency (on test: Docker has security, portability, and agility, and each container can have its own versions of dependencies)
- Helps developers get along with IT Ops
- Hypervisors virtualize the hardware for VMs
- Docker virtualizes the operating system layer for containers
- The containerized OS resources start up very fast
- The host controls resource utilization by each container
- Many OS files, directories, and running services are shared between containers and projected into each container’s namespace
- The host gives each container a virtualized namespace of resources
Instant startup, reliable execution and resource governance make containers great for development
Docker helps bridge the gap between microservices/cloud development and traditional apps.
Containers provide flexibility to integrate the development and running of different types of apps.
This is not a virtual machine OR container decision; both technologies can work together.
Comparing Docker Containers and Virtual Machines
- With a VM, you have a copy of the OS for every VM
- VM images are typically multiple GB in size
- Docker containers sit on a single instance of the OS
- Docker apps can run across multiple containers
- Containers start fast
- You can usually run 6-8 times as many containers as VMs
Docker Benefits for Web Developers
Accelerate Developer On-boarding
- Setting up the environment, especially for those working remotely, can be challenging
- Different users have different security clearance
Eliminate App/Version Conflicts
- If you want to move from one version of a framework to another, it may be difficult if your app is running alongside other apps that depend on the older framework
- Docker's isolated containers allow different apps with different frameworks to run on the same machine
- Docker uses namespace isolation
- Each container gets a virtualized namespace of the resources it can see
Environment Consistency
- When consulting, you often have to rely on environments set up by other companies
- Docker containers give you more control
Where does Docker Run Natively?
- Runs on Linux or Windows Server 2016 (on test)
- Docker has its roots in Linux (Linux Containers, LXC)
- Windows Server 2016 has built-in support as well
Where else does Docker run?
- Older versions of Windows use Docker Toolbox, which essentially runs Docker inside VirtualBox
- Windows 10 Pro - Docker Community Edition runs on top of Hyper-V
- Mac - Docker Community Edition runs on HyperKit
The Role of Images and Containers
- Docker core terminology: "images" and "containers"
- An image is something that is used to build a container
- The image is the blueprint used to get a running container
- Containers are where the live applications run
What is a Docker Image?
- A read-only template composed of layered filesystems
- Images are made up of multiple layers
- A layer can also be just another image
- Each image contains software you want to run
- Every image contains a base layer
- Layers are read only
- Writable areas can be shared with layers
Shared Layers
- Images can share layers in order to speed up transfer times and optimize disk and memory usage
- Parent images that already exist on the host do not have to be downloaded
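A minimal sketch of why shared layers cut transfer time, assuming each layer is identified by a digest. The digest strings and the `layers_to_download` helper are hypothetical illustrations, not part of Docker's API.

```python
def layers_to_download(image_layers, host_layers):
    """Return the layers of an image that are not already on the host, in order."""
    present = set(host_layers)
    return [layer for layer in image_layers if layer not in present]


# Hypothetical layer digests, shortened for readability.
host = ["sha256:aaa", "sha256:bbb"]                      # already pulled for another image
new_image = ["sha256:aaa", "sha256:bbb", "sha256:ccc"]   # shares its two parent layers

needed = layers_to_download(new_image, host)
print(needed)  # ['sha256:ccc']
```

Only the one layer that the host has never seen would need to be transferred; the two parent layers are reused.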
Base Images/Layers
- Host OS - the OS running on bare metal or a VM
- Base OS - refers to a Docker image that contains some OS functionality, such as Ubuntu, CentOS, or windowsservercore
- Access to resources like networking interfaces and disk drives is virtualized inside the Docker environment using the base OS
- The base image is the basic parent image on which you add layers
- If Host OS == Linux, Base OS could be centos, busybox, alpine, other Linux flavors, or even scratch (NO Base OS)
- If Host OS == Windows Server, Base OS could be nanoserver or windowsservercore
What is a Docker Container?
- Standard unit for app service
- Created from an image
- Can be run, started, stopped, moved, and deleted
- Starts and stops very fast
- Easy to get it on and off the ship
- The ship, in our case, could be any of the following
  - The development environment
  - The staging environment
  - The production environment
Container Management (no questions on test)
- When you deploy hundreds or thousands of containers, tracking and management become an issue
- Management Solutions
  - Docker Compose - allows you to define multi-container apps
  - Docker Swarm - manages containers across multiple hosts
  - Mesos - an older solution that recently added Docker support
  - Kubernetes - built by Google; organizes containers into pods
  - Deis - an open source PaaS platform
Docker Commands
- Docker runs processes in isolated containers
- The container process has its own file system, its own networking, and its own isolated process tree separate from the host
The docker run command must specify an IMAGE
$ docker run [OPTIONS] IMAGE[:TAG|@DIGEST] [COMMAND] [ARG…]
[OPTIONS]
- With $ docker run [OPTIONS], an operator can add to or override the image defaults set by a developer or the Docker runtime
- Only the operator (the person who executes docker run) can set the following [OPTIONS]
  - detached or foreground running
    - docker run -d or docker run -d=true
  - container identification
    - docker run --name my-redis redis
  - network settings
  - runtime constraints on CPU and memory
Possible test question: which of these can be changed through the options? (detached or foreground, name, --network none)
The person who runs the command can put resource constraints on the container when the image is run (RAM, processor, network).
Running an image detached vs foreground
- "Detached" mode runs the container in the background, as opposed to the default foreground mode
- Containers started in detached mode exit when the root process used to start the container exits
- Do not pass a service x start command to a detached container: the container is shut down as soon as the start command succeeds and returns
- Foreground mode starts a container and attaches the console to the process's standard input, output, and standard error; it can even pretend to be a TTY
Container Identification
- An operator can identify a container in three ways
  - UUID long identifier, which comes from the Docker daemon (random string)
  - UUID short identifier (first part of the random string)
  - A name, which is defined at runtime and can be a meaningful way to identify a container
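The relationship between the long and short identifiers can be sketched as follows. This is an illustration only: it does not talk to the Docker daemon, the helper names are hypothetical, and the ID is simply a random 64-character hex string of the same shape as a real one (Docker displays the first 12 characters as the short form).

```python
import secrets


def new_container_id():
    """Generate a random 64-character hex string shaped like a Docker long ID."""
    return secrets.token_hex(32)


def short_id(long_id, length=12):
    """Return the leading characters of a long ID (12 is what Docker displays)."""
    return long_id[:length]


long_id = new_container_id()
print(long_id)
print(short_id(long_id))
```

Because the short form is just a prefix, either one identifies the same container as long as the prefix is unique on the host.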
Container Network Settings
- By default, all containers have networking enabled and can make any outgoing connections
- You could, however, disable networking
$ docker run --network none IMAGE
- There are various network options
Container Runtime Constraints
The operator can adjust performance parameters of the container:
$ docker run -it -m 300M --memory-swap -1 ubuntu:14.04 /bin/bash
This sets the memory limit to 300M and disables the swap memory limit.
$ docker run -it --cpu-period=50000 --cpu-quota=25000 ubuntu:14.04 /bin/bash
If there is 1 CPU, this means the container can get 50% of a CPU's run-time every 50 ms (25000 out of every 50000 microseconds).
The default CPU CFS (Completely Fair Scheduler) period is 100 ms.
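The CPU arithmetic above can be checked with a short sketch. `cpu_share` is a hypothetical helper, not a Docker function; it assumes both values are given in microseconds, as they are for --cpu-period and --cpu-quota.

```python
def cpu_share(cpu_quota_us, cpu_period_us=100_000):
    """Fraction of one CPU a container may use per CFS period (microseconds)."""
    if cpu_period_us <= 0:
        raise ValueError("cpu_period_us must be positive")
    return cpu_quota_us / cpu_period_us


# The example from the notes: quota 25000 out of a 50000 microsecond period.
print(cpu_share(25_000, 50_000))  # 0.5
# The same quota against the default 100 ms period is only a quarter of a CPU.
print(cpu_share(25_000))          # 0.25
```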
IMAGE[:TAG|@DIGEST]
- IMAGE[:tag] - a way to specify a version of an image
$ docker run ubuntu:14.04
- IMAGE[@digest] - another way to identify an image
$ docker run alpine@sha256:9cacb71397b640eca97488cf08582ae4e4068513101088e9f96c9814bfda95e0
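To make the IMAGE[:TAG|@DIGEST] syntax concrete, here is a rough sketch of how such a reference splits into its parts. `parse_image_ref` is a hypothetical helper written for these notes, not Docker's own parser, and it covers only the simple cases shown here.

```python
def parse_image_ref(ref):
    """Split IMAGE[:TAG|@DIGEST] into (image, tag, digest)."""
    tag = None
    digest = None
    if "@" in ref:
        ref, digest = ref.split("@", 1)
    # A colon after the last slash is a tag; an earlier colon is a registry port.
    last_slash = ref.rfind("/")
    colon = ref.rfind(":")
    if colon > last_slash:
        ref, tag = ref[:colon], ref[colon + 1:]
    if tag is None and digest is None:
        tag = "latest"  # Docker's default when neither is given
    return ref, tag, digest


print(parse_image_ref("ubuntu:14.04"))  # ('ubuntu', '14.04', None)
print(parse_image_ref("redis"))         # ('redis', 'latest', None)
```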
Google Cloud Platform (GCP)
- 16 Questions (6 T/F, 10 MC)
Trade off between flexibility to run what you want and what you have to manage
Cloud Service Levels/Pillars (Amazon calls them computing models)
- Infrastructure as a Service (IaaS)
  - Highest level of flexibility and management control over IT resources
  - Most similar to the existing IT resources that IT departments are already familiar with
  - Google Compute Engine, Google Cloud Storage, AWS EC2 & S3, Rackspace
- Hybrid Infrastructure/Platform as a Service
  - Google BigQuery, Cloud SQL, Pub/Sub, AWS RDS, Azure SQL Database
- Platform as a Service (PaaS)
  - The organization no longer needs to manage the underlying infrastructure (usually hardware and operating systems) and can focus on managing its applications
  - Google App Engine, Cloud Functions, Heroku, Engine Yard, AWS Lambda
- Software as a Service (SaaS)
  - Provides you with a completed product that is run and managed by the service provider
  - G Suite, Hotmail, Salesforce, NetSuite
  - Often refers to end-user apps, e.g. an email program
Summary of WHO takes responsibility under each level
Serverless (Platform as a Service: Cloud Functions, AWS Lambda)
- What is serverless?
  - Misnomer: of course there is a server somewhere
  - You just don't have to worry about it
  - Forbes (May 2018): "Serverless …[helps] developers focus on writing code without having to worry about infrastructure...servers (physical & virtual) completely abstracted away from the user. [Developers] … focused on solving business problems (e.g., faster app deployment)"
- Why serverless?
  - Per some analysts, the fastest growing segment of cloud
    - 1.9B (2016) and 4.25B (2018), rising to 7.7B (2021) and 14.93B (2023)
  - What if you go viral? - There is autoscaling
  - What if you don't? - If your code is not running, you are not paying (no VMs to shut down)
Google Cloud Essentials
- Google Cloud Console Services are on the left menu
- When added to projects, Services become Resources
- GCP Resources are the components that make up the Google Cloud Projects
- Resources are organized hierarchically using projects and folders
Organizing Cloud Resources
- Projects allow you to group resources together
- All resources must belong to exactly one project
- Optionally, projects can belong to organizations, which
  - provide central visibility across projects
  - provide central control across projects
- Departments and teams can be organized into folders
  - Folders can contain projects and subfolders
- When a parent is deleted, all its resources are deleted
- The hierarchy allows for access-control policy inheritance
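The two hierarchy rules above (deleting a parent deletes everything under it, and policies are inherited downward) can be sketched with a small tree. This is a simplified model under stated assumptions, not the Cloud IAM API: the `Node` class, the resource names, and the "user:role" strings are all hypothetical, and inheritance is modeled as a plain union of roles.

```python
class Node:
    """One node in a hypothetical hierarchy: organization, folder, project, or resource."""

    def __init__(self, name, parent=None, policy=None):
        self.name = name
        self.parent = parent
        self.children = []
        self.policy = set(policy or [])  # roles granted directly at this node
        if parent is not None:
            parent.children.append(self)

    def effective_policy(self):
        """Union of this node's policy and everything inherited from its ancestors."""
        inherited = self.parent.effective_policy() if self.parent else set()
        return inherited | self.policy

    def delete(self):
        """Delete this node and, recursively, everything beneath it."""
        deleted = [self.name]
        for child in list(self.children):
            deleted.extend(child.delete())
        if self.parent is not None:
            self.parent.children.remove(self)
        return deleted


org = Node("example-org", policy={"alice:viewer"})
folder = Node("engineering", parent=org, policy={"bob:editor"})
project = Node("web-app", parent=folder)
bucket = Node("uploads-bucket", parent=project)

policy = sorted(bucket.effective_policy())
print(policy)   # ['alice:viewer', 'bob:editor']
removed = folder.delete()
print(removed)  # ['engineering', 'web-app', 'uploads-bucket']
```

The bucket has no policy of its own, yet it ends up with both roles because they were granted higher in the tree.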
GCP Dashboard
- Provides a high-level overview of the selected GCP project
- Highlights key metrics and billing
- The dashboard can be customized
  - Any card can be hidden, shown, or reordered on the page
- You can manage any of hundreds of APIs
  - Enable and disable services
  - Generate or revoke credentials
  - Monitor requests
GCP G-Suite
- Software as a Service
GCP App Engine
- With App Engine, you just upload the code and Google handles the rest
Flexible Runtime Languages
- Python 2.7 & 3.6
- Node.js
- Ruby
- Java 8 / Servlet 3.1
- Jetty 9
- PHP 5.6, 7
- Go 1.8, 1.9, 1.10
- C#/.NET
In the App Engine Flexible environment, the App is launched inside Docker running inside a VM.
GCP Compute Engine
- Delivers configurable virtual machines of all shapes and sizes
- From "micro" to
  - 160 vCPUs
  - 3.75 TB of RAM
  - 64 TB HDD or SSD
- Debian, CentOS, CoreOS, SUSE, Red Hat Enterprise Linux, Ubuntu, FreeBSD
- Windows Server 2008 R2, 2012 R2, 2016
GCP Cloud SQL
- SQL servers in the cloud
- High-performance, fully managed
  - 600 MB to 416 GB RAM
  - Up to 64 vCPUs
  - Up to 10 TB storage; 40,000 IOPS
- Types
  - MySQL
  - PostgreSQL
Cloud SQL runs outside App Engine so that it can be reached by third-party applications or other applications.
GCP Authentication
We add authentication (under GCP APIs and Services) to applications by using the Google Identity Platform. This lets us access information about the site's users while Google safely manages their credentials. OAuth 2.0 makes it possible for the site to accept Google accounts for authentication and to access basic profile information about the users.
A web app client ID and client secret allow our app to authorize users and access Google APIs on their behalf.
The diagram to the right shows how the parts of the application connect to each other. Users log in with their Google Account, and their basic user data is stored in the session.
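As a rough illustration of the first step in that login flow, the sketch below builds the URL that the app would redirect a user to. It uses only the standard library rather than a Google client library; the client ID and redirect URI are placeholders, and `build_auth_url` is a hypothetical helper. The later steps (exchanging the returned code for tokens with the client secret) are not shown.

```python
import secrets
from urllib.parse import parse_qs, urlencode, urlparse

# Google's OAuth 2.0 authorization endpoint.
GOOGLE_AUTH_ENDPOINT = "https://accounts.google.com/o/oauth2/v2/auth"


def build_auth_url(client_id, redirect_uri, state):
    """Build the URL a user is sent to in order to log in with a Google Account."""
    params = {
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "response_type": "code",           # ask for an authorization code
        "scope": "openid email profile",   # basic profile information only
        "state": state,                    # random value checked on return (CSRF guard)
    }
    return GOOGLE_AUTH_ENDPOINT + "?" + urlencode(params)


state = secrets.token_urlsafe(16)
url = build_auth_url(
    "YOUR_CLIENT_ID.apps.googleusercontent.com",   # placeholder client ID
    "http://localhost:8080/oauth2callback",        # placeholder redirect URI
    state,
)
query = parse_qs(urlparse(url).query)
print(query["scope"])  # ['openid email profile']
```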
GCP Cloud Storage
A bucket is the basic container that holds data in Cloud Storage.
GCP PubSub
It is common for web applications to need to perform background processing outside the context of a web request. The application can enqueue a job or task that is then completed by a worker process.
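The enqueue-and-worker pattern can be sketched in a single process with the standard library. This is a stand-in for the idea only, not the Cloud Pub/Sub client: a `queue.Queue` plays the role of the topic, one thread plays the worker, and the job names are made up.

```python
import queue
import threading


def worker(jobs, results):
    """Pull jobs off the queue until a None sentinel arrives."""
    while True:
        job = jobs.get()
        if job is None:
            jobs.task_done()
            break
        results.append(job.upper())  # stand-in for slow background work
        jobs.task_done()


jobs = queue.Queue()
results = []
thread = threading.Thread(target=worker, args=(jobs, results))
thread.start()

# The "web request" side only enqueues work and returns immediately.
for title in ["resize image", "send email"]:
    jobs.put(title)
jobs.put(None)  # tell the worker to stop

jobs.join()
thread.join()
print(results)  # ['RESIZE IMAGE', 'SEND EMAIL']
```

With a real message service the producer and the worker would be separate processes, which is what lets the web request finish before the job does.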
GCP Data Storage Options
- Cloud Storage, Cloud Storage for Firebase
  - Unstructured data
  - Object storage solution
  - Binary data
- Cloud Spanner, Cloud SQL
  - Structured data
  - Non-analytic data
  - Relational data
  - Use Cloud SQL if you DON'T need horizontal scalability
  - Use Cloud Spanner if you need horizontal scalability
- Firebase Realtime DB, Cloud Datastore
  - Structured data
  - Non-analytic data
  - Non-relational data
  - Use the Realtime DB for pushing realtime updates
  - Use Firestore for a JSON database
  - Firestore is the latest version of Cloud Datastore
- Bigtable, BigQuery
  - The right choice for analytics/reporting data
  - Bigtable is for high-throughput and low-latency workloads
  - Bigtable integrates with Hadoop and Spark
  - Bigtable supports the HBase API
  - BigQuery is not for frequent updates
  - BigQuery is more like a data warehouse
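The decision rules in this list can be written down as a small function, which may help when memorizing them. `choose_storage` and its parameter names are hypothetical; the branches simply restate the notes above and are not an official selection guide.

```python
def choose_storage(structured, analytics=False, relational=False,
                   horizontal_scale=False, realtime_updates=False,
                   low_latency=False):
    """Pick a GCP storage option by following the rules in the notes."""
    if not structured:
        return "Cloud Storage"  # unstructured, binary, object data
    if analytics:
        return "Bigtable" if low_latency else "BigQuery"
    if relational:
        return "Cloud Spanner" if horizontal_scale else "Cloud SQL"
    return "Firebase Realtime DB" if realtime_updates else "Cloud Firestore"


print(choose_storage(structured=False))                   # Cloud Storage
print(choose_storage(structured=True, relational=True))   # Cloud SQL
print(choose_storage(structured=True, analytics=True))    # BigQuery
```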
GCP BigQuery
- A fast, highly scalable, cost-effective, and fully managed data warehouse
- Built-in machine learning
- Issue SQL queries across multiple terabytes of data
Amazon Web Services (AWS) - 12 Questions (7 T/F, 5 MC)
Amazon as of Q4 of 2018 has 32.3% of cloud market share
Microsoft Azure as of Q4 of 2018 has 16.5% of cloud market share.
Google Cloud as of Q4 of 2018 has 9.5% market share.
AWS
- In 2006, Amazon Web Services (AWS) began offering IT infrastructure services to businesses as web services
- Now serving over a million active customers in more than 190 countries
- Replaces upfront capital infrastructure expenses with low variable costs that scale with your business
- If needed, businesses can spin up thousands of servers in minutes and deliver results faster
What is Cloud Computing according to Amazon
- The on-demand delivery of compute power, database storage, applications, and other IT resources through a cloud services platform via the Internet with pay-as-you-go pricing
- Access as many resources as you need
- Pay for what you use
Six Advantages of AWS Cloud Computing
- Trade capital expense for variable expense
- Benefit from massive economies of scale
- Stop guessing capacity
  - Eliminate guessing on your infrastructure capacity needs
- Increase speed and agility
  - New IT resources are a click away
- Stop spending money running and maintaining data centers
  - Focus on projects that differentiate your business
- Go global in minutes
Cloud Computing Deployment Models
- Cloud
  - fully deployed in the cloud, and all parts of the app run in the cloud
  - can utilize low-level infrastructure or higher-level services
- Hybrid
  - connects infrastructure and applications between cloud-based resources and existing non-cloud resources
- On-premises
  - deploying resources on-premises using virtualization and resource management tools, sometimes called the "private cloud"
  - Goal of improving resource utilization through virtualization
Amazon Cloud Categories
- Amazon Analytics products
- Amazon Application Integration
- Amazon Augmented Reality (AR) and Virtual Reality (VR)
- Amazon Cost Management
- Amazon Blockchain
- Amazon Business Applications
- Amazon Compute (EC2, Elastic Beanstalk)
- Amazon Customer Engagement
- Amazon Database (RDS)
- Amazon Desktop and App Streaming
- Amazon Developer Tools
- Amazon Game Tech
- Amazon IoT
- Amazon Machine Learning
- Amazon Management and Governance (CloudWatch, CloudFormation)
- Amazon Media Services
- Amazon Migration and Transfer
- Amazon Mobile
- Amazon Networking & Content Delivery
- Amazon Robotics
- Amazon Satellite
- Amazon Security, Identity, & Compliance
- Amazon Storage (S3 bucket)
Amazon Elastic Beanstalk
To create an application on Elastic Beanstalk, you must create an environment in which the app will run. Launching an environment creates the following resources:
- EC2 instance - An Amazon Elastic Compute Cloud (Amazon EC2) virtual machine configured to run web apps on the platform you choose.
- Instance security group - An Amazon EC2 security group configured to allow inbound traffic on port 80.
- Amazon S3 bucket - A storage location for your source code, logs, and other artifacts.
- Amazon CloudWatch alarms - Two CloudWatch alarms that monitor the load on the instances in your environment and are triggered if the load is too high or too low.
- AWS CloudFormation stack - Elastic Beanstalk uses AWS CloudFormation to launch the resources in your environment.
- Domain name - A domain name that routes to your web app in the form subdomain.region.elasticbeanstalk.com.
Platforms available on Elastic Beanstalk:
Ruby, PHP, Java, Node.js, Python
Amazon Database Services
Amazon Relational Database Service (RDS)
Relational databases store data in pre-defined tables with relationships between the tables. RDBMSs are designed to support ACID transactions and to maintain referential integrity and data consistency.
Used for: Traditional applications, ERP, CRM, and e-commerce.
Keeping the database decoupled from the EC2 environment is a better solution because the database will not share the lifecycle of the application's environment.
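The ACID and referential-integrity guarantees mentioned above can be seen in a small experiment. This sketch uses Python's built-in SQLite as a stand-in for an RDS engine, and the table and column names are made up; the point is only that a transaction containing an invalid row is rolled back as a whole.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE orders ("
    " id INTEGER PRIMARY KEY,"
    " customer_id INTEGER NOT NULL REFERENCES customers(id),"
    " total REAL NOT NULL)"
)
conn.execute("INSERT INTO customers (id, name) VALUES (1, 'Ada')")
conn.commit()

try:
    with conn:  # one transaction: commit on success, roll back on error
        conn.execute("INSERT INTO orders (customer_id, total) VALUES (1, 19.99)")
        # Customer 999 does not exist, so this violates referential integrity.
        conn.execute("INSERT INTO orders (customer_id, total) VALUES (999, 5.00)")
except sqlite3.IntegrityError as exc:
    print("rolled back:", exc)

order_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
print(order_count)  # 0
```

The first order was valid on its own, but it disappears too: the transaction is atomic, so either both rows are stored or neither is.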
Amazon RDS includes the following Engine options:
- Amazon Aurora
- MySQL
- MariaDB
- PostgreSQL
- Oracle
- Microsoft SQL Server
Amazon DynamoDB
Key-value databases are optimized to store and retrieve key-value pairs in large volumes and in milliseconds, without the performance overhead and scale limitations of relational databases.
Used for: Internet-scale applications, real-time bidding, shopping carts, and customer preferences.
Amazon DocumentDB (with MongoDB compatibility)
Document databases are designed to store semi-structured data as documents and are intuitive for developers to use because the data is typically represented as a readable document.
Used for: Content management, personalization, and mobile applications.
Amazon ElastiCache for Redis and Memcached
In-memory databases are used for applications that require real time access to data. By storing data directly in memory, these databases provide microsecond latency where millisecond latency is not enough.
Used for: Caching, gaming leaderboards, and real-time analytics.
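The caching use case usually follows a "cache-aside" pattern: look in memory first and go to the database only on a miss. The sketch below assumes that pattern and uses a plain dictionary in place of Redis or Memcached; the `CacheAside` class, the `slow_lookup` function, and the leaderboard scores are hypothetical.

```python
import time


class CacheAside:
    """Serve reads from memory, falling back to a slower loader on a miss."""

    def __init__(self, load_from_db, ttl_seconds=60.0):
        self._load = load_from_db
        self._ttl = ttl_seconds
        self._store = {}  # key -> (expires_at, value)

    def get(self, key):
        entry = self._store.get(key)
        now = time.monotonic()
        if entry is not None and entry[0] > now:
            return entry[1]              # cache hit: served from memory
        value = self._load(key)          # cache miss: fall back to the database
        self._store[key] = (now + self._ttl, value)
        return value


db_calls = []


def slow_lookup(player):
    """Stand-in for a database query; records each call so hits can be counted."""
    db_calls.append(player)
    return {"alice": 4200, "bob": 3100}[player]


cache = CacheAside(slow_lookup)
first = cache.get("alice")
second = cache.get("alice")
print(first, second, len(db_calls))  # 4200 4200 1
```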
Amazon Neptune
Graph databases are used for applications that need to enable millions of users to query and navigate relationships between highly connected, graph datasets with millisecond latency.
Used for: Fraud detection, social networking, and recommendation engines.
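A recommendation query such as "people two hops away from me" shows what navigating relationships means. The sketch below is a plain breadth-first search over an in-memory dictionary, not a Neptune query; the `within_hops` helper and the names in the graph are made up.

```python
from collections import deque


def within_hops(graph, start, max_hops):
    """Breadth-first search: map each reachable node to its hop distance from start."""
    distance = {start: 0}
    frontier = deque([start])
    while frontier:
        node = frontier.popleft()
        if distance[node] == max_hops:
            continue  # do not expand past the hop limit
        for neighbor in graph.get(node, ()):
            if neighbor not in distance:
                distance[neighbor] = distance[node] + 1
                frontier.append(neighbor)
    del distance[start]
    return distance


follows = {
    "ann": ["bob", "cid"],
    "bob": ["dee"],
    "cid": ["dee", "eve"],
    "dee": ["fay"],
}
reachable = within_hops(follows, "ann", 2)
suggestions = sorted(name for name, hops in reachable.items() if hops == 2)
print(suggestions)  # ['dee', 'eve']
```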
Amazon Timestream
Time series databases are used to efficiently collect, synthesize, and derive insights from enormous amounts of data that changes over time (known as time-series data).
Used for: IoT applications, DevOps, and industrial telemetry.
Amazon Quantum Ledger Database
Ledger databases are used when you need a centralized, trusted authority to maintain a scalable, complete and cryptographically verifiable record of transactions.
Used for: Systems of record, supply chain, registrations, and banking transactions.
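One common way to make a record of transactions cryptographically verifiable is a hash chain, where every entry stores the hash of the entry before it. The sketch below illustrates that general idea only; it is not how Amazon QLDB is implemented or accessed, and the helper names and account data are hypothetical.

```python
import hashlib
import json


def entry_hash(data, prev_hash):
    """Hash an entry's data together with the previous entry's hash."""
    body = json.dumps({"data": data, "prev": prev_hash}, sort_keys=True)
    return hashlib.sha256(body.encode()).hexdigest()


def append_entry(ledger, data):
    """Append a new entry that is chained to the current last entry."""
    prev_hash = ledger[-1]["hash"] if ledger else "0" * 64
    ledger.append({"data": data, "prev": prev_hash,
                   "hash": entry_hash(data, prev_hash)})


def verify(ledger):
    """Recompute every hash; any edit to an earlier entry breaks the chain."""
    prev_hash = "0" * 64
    for entry in ledger:
        if entry["prev"] != prev_hash:
            return False
        if entry["hash"] != entry_hash(entry["data"], prev_hash):
            return False
        prev_hash = entry["hash"]
    return True


ledger = []
append_entry(ledger, {"from": "acct-1", "to": "acct-2", "amount": 50})
append_entry(ledger, {"from": "acct-2", "to": "acct-3", "amount": 20})
ok_before = verify(ledger)
ledger[0]["data"]["amount"] = 5000  # tamper with history
ok_after = verify(ledger)
print(ok_before, ok_after)  # True False
```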