Posts Tagged - google

Google Cloud Developer Certification - Index


These are personal notes for the GCP Developer certification. To get ready, I fully recommend doing the Qwiklabs and Coursera courses to prepare yourself.

Google Cloud Platform (GCP) Fundamentals: Core Infrastructure

1. Introducing Google Cloud Platform
2. Getting started with GCP
3. Virtual machines in the cloud
4. Storage in the cloud
5. Containers in the cloud
6. Applications in the cloud
7. Developing in the cloud
8. Big Data in the cloud
9. Machine Learning in the cloud

Getting started with Application Development

1. Best practices for app development
2. Google Cloud SDK, Client Libraries and Firebase SDK
3. Data Storage Options
4. Best practices for Cloud Datastore
5. Best practices for Cloud Storage

Securing and Integrating Components of your Application

1. Cloud IAM (Identity and Access Management)
2. OAuth2.0, IAP and Firebase Authentication
3. Cloud Pub/Sub (needs cleaning)
4. Cloud Functions (needs cleaning)
5. Cloud Endpoints (needs cleaning)

App Deployment, Debugging and Performance

1. Deploying Applications (needs cleaning)
2. Execution Environments for your App (needs cleaning)
3. Debugging, Monitoring and Tuning Performance (needs cleaning)

Course Qwiklabs

1. Setting up a development environment

Extra Qwiklabs

Using the Cloud SDK Command Line (link to course)

1. Getting started with Cloud Shell and gcloud
2. Configuring networks with gcloud
3. Configuring IAM permissions with gcloud
4. gsutil commands for Buckets
5. gsutil commands for BigQuery

Read More

Debugging, Monitoring and Tuning Performance

Google Stackdriver

Google Stackdriver is a multi-cloud service.

Stackdriver combines metrics, logs and metadata whether you’re running on GCP, AWS, on-premises infrastructure or a hybrid cloud. You can quickly understand service behaviour and issues from a single comprehensive view of your environment and take action if needed.

Stackdriver includes features such as

  • logging
  • monitoring
  • trace
  • error reporting
  • debug

Read More

Execution environments for your App

GCP allows you to choose an app environment that matches the needs of your app. More control over the infrastructure usually implies greater operational burden.
If your needs change, it isn’t a problem. As long as you use Google Client Libraries in your app to work with GCP services, you can usually move to another execution environment.

Execution environments, from fully managed to highly customizable:

  • Cloud Dataflow (fully managed)
  • Cloud Functions
  • Cloud Run
  • App Engine flexible env
  • GKE
  • Compute Engine (highly customizable)

Summary

GCP provides a range of options to run your app code.

Cloud Dataflow and Cloud Run are fully managed.
App Engine, GKE and Compute Engine offer steadily increasing abilities to customize your execution environment.

Fully managed compute environments require minimal setup and operations. On the other hand, highly customizable environments require greater operational and management effort to keep the application running optimally.

Read More

Deploying applications

Implement CI / CD for reliable releases

Continuous Integration

CI is a developer workflow in which developers frequently pull from the master branch and commit their changes into a feature branch. These commits trigger builds in a build system such as Jenkins.

The build process creates a new application image by using Cloud Build and stores it in an artifact repository such as Container Registry. A deployment system such as Spinnaker deploys the artifacts into your cloud environment.

You can use Deployment Manager to stand up the resources for the managed services that your app needs.

After your app is deployed in your development environment, you can automatically run tests to verify your code.

If all tests pass, you can merge the feature branch into master.
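As an illustration of the build step described above, a minimal Cloud Build configuration might look like the following sketch (the image name and tag variables are hypothetical):

```yaml
# cloudbuild.yaml - minimal pipeline sketch:
# build a container image and push it to Container Registry.
steps:
  - name: 'gcr.io/cloud-builders/docker'
    args: ['build', '-t', 'gcr.io/$PROJECT_ID/my-app:$SHORT_SHA', '.']
images:
  - 'gcr.io/$PROJECT_ID/my-app:$SHORT_SHA'
```

Triggering this on every commit to a feature branch gives you the image that a deployment system such as Spinnaker can then roll out.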

Read More

Cloud Functions

Cloud Functions make it possible to run logic in a completely serverless environment, executed on demand and in response to events. They act as glue between disparate applications.

With them, you can build a highly scalable serverless microservices architecture and focus only on the code, without worrying about setting up servers. They can be used in a variety of cases that require lightweight microservices, event-driven processing or webhooks.

GCP Services emit events such as when files are uploaded to a Cloud Storage bucket or messages are published to a Cloud Pub/Sub topic.

They can invoke APIs or write data back to the Cloud and they’re ideal for microservices that require a small piece of code to quickly process data related to an event.

They’re priced according to how long your function runs, the number of times it’s invoked and the resources that you provision for the function.

They can be invoked asynchronously in response to events, or synchronously by direct HTTP requests. For HTTP functions, it’s important to keep the execution time to a minimum.

Cloud functions have a default timeout value of 60 seconds.
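As a sketch of what such an HTTP function looks like: it’s just a function that takes a request and returns a response. In a real deployment Cloud Functions passes a Flask request object; the FakeRequest class below is only a local stand-in for illustration.

```python
# Minimal sketch of an HTTP-triggered Cloud Function.
# In production, `request` is a Flask request object; FakeRequest
# below is a hypothetical stand-in so the example runs locally.

def hello_http(request):
    """Responds to an HTTP request with a short greeting."""
    body = request.get_json(silent=True) or {}
    name = body.get("name", "World")
    return f"Hello, {name}!"

class FakeRequest:
    def __init__(self, json_body=None):
        self._json = json_body
    def get_json(self, silent=False):
        return self._json

print(hello_http(FakeRequest({"name": "GCP"})))  # Hello, GCP!
```

Keeping the body this small is exactly what the execution-time advice above asks for.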

Read More

Cloud Pub/Sub

Cloud Pub/Sub is a fully managed real-time messaging service that enables you to build loosely coupled microservices that communicate asynchronously. You can use it to integrate components of your app: it enables your app to perform operations asynchronously, stay loosely coupled, and adopt open multi-cloud or hybrid architectures.

It delivers each message to every subscription at least once, so a subscriber can sometimes see duplicated messages.
You can send/receive Cloud Pub/Sub messages programmatically, via open REST/HTTP and gRPC service APIs, or with an Apache Kafka connector.

It scales automatically depending on the volume of messages and enables you to securely integrate distributed systems.
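Because delivery is at least once, subscribers should be made idempotent. A minimal sketch of deduplicating by message ID (the Message class here is a hypothetical stand-in for the client library’s message object):

```python
from dataclasses import dataclass

@dataclass
class Message:
    message_id: str   # Pub/Sub assigns a unique ID to each message
    data: str

class IdempotentHandler:
    """Processes each message at most once by remembering seen IDs."""
    def __init__(self):
        self.seen = set()
        self.processed = []

    def handle(self, msg):
        if msg.message_id in self.seen:
            return False          # duplicate delivery: safely ignored
        self.seen.add(msg.message_id)
        self.processed.append(msg.data)
        return True

handler = IdempotentHandler()
handler.handle(Message("1", "order-created"))
handler.handle(Message("1", "order-created"))  # redelivery of the same message
```

After both calls, handler.processed holds a single entry. In production the set of seen IDs would need eviction or persistent storage, but the shape of the consumer-side guarantee is the same.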

Read More

Cloud Endpoints

Cloud Endpoints enables you to deploy and manage your APIs on GCP, and to implement an API gateway.

Implement an API Gateway

An API gateway creates a layer of abstraction and insulates the clients from the partitioning of an application into microservices.

The API for your app can run on backends such as App Engine, GKE or Compute Engine.

They may work as a facade for legacy applications that cannot be refactored and moved to the cloud. Each consumer can then invoke these modern APIs instead of invoking outdated APIs.
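The facade idea can be sketched as a tiny routing table — the route prefixes and backend names below are hypothetical, not part of any Cloud Endpoints API:

```python
# Hypothetical prefix-based routing: the gateway presents one API surface
# and forwards each request either to a microservice or to the legacy app.
ROUTES = {
    "/orders": "orders-service",
    "/users": "users-service",
}

def route(path):
    for prefix, backend in ROUTES.items():
        if path.startswith(prefix):
            return backend
    return "legacy-monolith"  # facade fallback for not-yet-migrated endpoints

print(route("/orders/42"))  # orders-service
```

Clients only ever see the gateway’s paths, so backends can be migrated one route at a time.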

Using the Apigee platform

Read More

OAuth2.0, Cloud IAP and Firebase Authentication

Generally, it’s best to use a service account for authentication to a GCP API.

Use OAuth2.0 to access resources on behalf of a user

Use cases

  • your app needs access to BigQuery datasets that belong to users
  • your app needs to authenticate as a user to create projects on their behalf
  1. Your app requests access to the resources.
  2. The user is prompted for consent.
  3. If consent is given, the app can request credentials from an authorization server.
  4. The app can then use those credentials to access resources on behalf of the user.

Read More

Cloud IAM (Identity and Access Management)

The Firebase SDK makes it really easy to implement Federated Identity Management.

Cloud IAM lets you manage access control by defining who (members) has what access (role) for which resource. You can grant granular access to GCP resources following the least-privilege principle: grant access only to the resources that are necessary.

Cloud Platform resources are organized hierarchically, with the organization node as the root of the hierarchy. Projects are children of the organization, and the other resources are children of projects. Each resource has exactly one parent.
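The policy inheritance implied by this hierarchy can be sketched as a toy model — this is not the Cloud IAM API, and the member names are hypothetical:

```python
# Toy model: IAM bindings are inherited down the resource hierarchy,
# so a resource's effective policy is the union of its own bindings
# and those of all its ancestors.
class Resource:
    def __init__(self, name, parent=None):
        self.name, self.parent, self.bindings = name, parent, {}

    def grant(self, member, role):
        self.bindings.setdefault(role, set()).add(member)

    def effective_bindings(self):
        merged, node = {}, self
        while node:  # walk up: each resource has exactly one parent
            for role, members in node.bindings.items():
                merged.setdefault(role, set()).update(members)
            node = node.parent
        return merged

org = Resource("organization")
project = Resource("my-project", parent=org)
bucket = Resource("my-bucket", parent=project)
org.grant("alice@example.com", "roles/viewer")      # inherited everywhere
project.grant("bob@example.com", "roles/editor")    # inherited by the bucket
```

A viewer granted at the organization level can see the bucket, which is why broad grants high in the hierarchy cut against least privilege.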

Read More

Best practices for Cloud Storage

Cloud Storage is ideal for

  • Store and serve static content such as HTML, CSS and JS
  • Store and retrieve a large number of files
  • Store multi-terabyte files

Resources are entities including:

  • projects
  • buckets - the basic containers.
    • they hold immutable objects
    • bucket names must be globally unique
    • names may contain dots if they form a valid domain name
  • objects - individual pieces of data

Read More

BigQuery CLI usage

Check: Get meaningful insights with BigQuery

BigQuery offers a number of sample tables that you can run queries against. These examples run queries against the shakespeare table, which contains an entry for every word in every play.

General help

# see a list of all bq commands  
bq help  

# show info for query command  
bq help query  

Search info

# examine the shakespeare table  
bq show bigquery-public-data:samples.shakespeare

# query to see how many times the substring "raisin" appears in Shakespeare's works.
bq query --use_legacy_sql=false \
    'SELECT word, SUM(word_count) AS count
     FROM `bigquery-public-data`.samples.shakespeare
     WHERE word LIKE "%raisin%"
     GROUP BY word'

# see the top 5 most popular names
bq query "SELECT name,count
          FROM babynames.names2010
          WHERE gender = 'F'
          ORDER BY count DESC
          LIMIT 5"


# list any existing dataset in your project
bq ls

Create queries and upload a dataset

# create a new dataset named babynames
bq mk babynames

Before you can build a table, you need to add the dataset to your project. The custom data file you’ll use contains approximately 7 MB of data about popular baby names, provided by the US Social Security Administration.

The bq load command creates or updates a table and loads data in a single step.

# create the table
bq load babynames.names2010 yob2010.txt name:string,gender:string,count:integer

This is equivalent to:

datasetID: babynames  
tableID: names2010  
source: yob2010.txt  
schema: name:string,gender:string,count:integer

# confirm the table appears
bq ls babynames

# see the table schema
bq show babynames.names2010

# remove table
bq rm -r babynames

Read More

gsutil commands for Buckets and Objects

System management

# get project id  
gcloud config get-value project

# set environmental vars  
PROJECT_ID=`gcloud config get-value project`  
BUCKET=mariocodes-personal-bucket

Buckets and objects operations

List, download and sync buckets. Upload files

# gsutil flags
#   -m run operations in parallel (multi-threaded)

# list buckets and objects in your bucket
gsutil ls
gsutil ls gs://${BUCKET}

# check storage classes
gsutil ls -Lr gs://${BUCKET} | more

# download the whole bucket
gsutil -m cp -R gs://${BUCKET} .

# sync local folder with bucket content
gsutil -m rsync -d -r local-folder gs://${BUCKET}
#  use it with the whole root local folder
# -d delete files on target if they're missing on local
# -r recursive

# upload a file with nearline storage.
gsutil cp -s nearline thisanearlinefile gs://${BUCKET}

For more info on file storage classes check here

Modify objects

# make all objects in a folder public
gsutil -m acl set -R -a public-read gs://${BUCKET}/folder
# to confirm they're public go to
# http://storage.googleapis.com/<your-bucket-name>/folder/old.txt on a web-browser

Create and delete buckets

# create a bucket with the multi-regional storage class
gsutil mb -c multi_regional gs://${BUCKET}

# delete bucket with object
gsutil rm -rf gs://${BUCKET}
# delete an empty bucket
gsutil rb gs://${BUCKET}

Read More

Best practices for Cloud Datastore

Cloud Datastore is a fully managed, scalable NoSQL database service that you can use to store structured or semi-structured app data. It can scale from zero to millions of requests per second.

If you need to execute ad hoc queries on large data sets without previously defined indexes, you should use Google BigQuery instead.

Using it requires creating an App Engine application.
There’s a query language called GQL that you can use to query entities in Datastore. It’s very similar to SQL.
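For example, a GQL query looks very much like SQL — the Task kind and its done and created properties below are hypothetical names for illustration:

```sql
SELECT * FROM Task WHERE done = FALSE ORDER BY created DESC LIMIT 10
```

The main difference from SQL is that queries run against kinds and entities rather than tables and rows, and every filter must be backed by an index.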

Ancestor queries of entity groups give you a strongly consistent view of the data. By creating entity groups, you can ensure that all related entities can be updated in a single transaction.

Cloud Datastore automatically builds indexes for individual properties in an entity. To enable more complex queries with filters on multiple properties, you can create composite indexes.

It can scale with zero downtime, but make sure to ramp up traffic gradually. A general guideline is the 500/50/5 rule: start with a base write rate of 500 writes per second and increase it by 50% every 5 minutes. Distribute your writes across the key range.
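The 500/50/5 ramp-up schedule can be computed with a quick sketch:

```python
def ramp_schedule(minutes, base=500, growth=1.5, step=5):
    """Target write rates (writes/s) under the 500/50/5 rule:
    start at `base` writes/s and increase by 50% every `step` minutes."""
    return [int(base * growth ** (m // step)) for m in range(0, minutes + 1, step)]

print(ramp_schedule(20))  # [500, 750, 1125, 1687, 2531]
```

After 20 minutes you are already above 2,500 writes/s, which shows how quickly the gradual ramp still reaches high throughput.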

Read More

Data Storage Options

The full suite of storage service options, varying by application and workload, is:
(see the summary graphic at 2:20)

Product       | Description                                  | Ideal for
Cloud Storage | binary/object store                          | large or rarely accessed unstructured data
Datastore     | NoSQL scalable store for structured data     | pure-serve use cases with structured data
Firestore     | NoSQL cloud-native app data at global scale  | real-time NoSQL database to store and sync data
BigTable      | NoSQL high-volume, low-latency database      | heavy read/write or analytical data
CloudSQL      | SQL VM-based RDBMS                           | web frameworks, existing apps
Spanner       | SQL relational DB system                     | low-latency transactional systems
BigQuery      | auto-scaling analytic data warehouse         | interactive analysis of static datasets

Read More

Configuring IAM Permissions via gcloud

Identity and Access Management (IAM) lets you manage access control by defining who (identity) has what access (role) for which resource.

With Cloud IAM it’s possible to grant granular access to specific GCP Resources and prevent unwanted access to other resources.

Identities

Members can be of the following types

  • Google account
  • Service account
  • Google group
  • G Suite Domain
  • Cloud Identity Domain

more information on this here

Read More

Configuring Networks via gcloud

A virtual private cloud (VPC) network is a global resource which consists of a list of regional subnetworks (subnets) in data centers, all connected by a global wide area network (WAN).

VPC networks are isolated from each other. They provide functionality to Compute Engine VMs, GKE and App Engine.

Each GCP project has a default network configuration, which provides each region with an auto-mode subnet network.

Create a network

You can choose to create a VPC network in auto mode or custom mode.
You can create up to four additional networks in a project. Each of them must have a unique name within the project.

# creates a network called labnet
gcloud compute networks create labnet --subnet-mode=custom

Read More

Google Cloud SDK, Google Cloud Client Libraries & Firebase SDK

Client libraries are the preferred method to invoke GCP APIs. You can use the API Explorer as a sandbox to try them out. They’re available in many programming languages. With them, you can write application code that can be executed in a compute environment such as App Engine, GKE or Compute Engine.

With Firebase SDK you can implement federated identity management to authenticate users without writing it from scratch.
When you’re ready to write your app, use Google Cloud Client Libraries to programmatically interact with GCP Services.
If you need to write scripts to work with GCP Services, use the Google Cloud SDK. It contains command line tools to help you out.

Read More

Getting started with Cloud Shell and gcloud

Cloud Shell is a Debian-based virtual machine with a persistent 5 GB home directory, which makes it easy to manage GCP projects and resources.

The gcloud command is part of the Google Cloud SDK. To use it on your own system, you need to download, install and initialize it. It comes pre-installed in Google Cloud Shell, giving you access to computing resources hosted on GCP.

For more in-depth resources, check here
Google Cloud Shell docs
gcloud tool docs

Read More

Best practices for app development

Loosely coupled Microservices and API Gateways

Applications that run in the cloud must be built for global reach, scalability, high availability and security.
They should be responsive and accessible to users across the world, and able to handle high traffic volumes reliably.

Manage your app’s code and environment

Store your app’s code in a code repository. Use dependency managers; don’t store external dependencies such as third-party packages in your repo. Separate your configuration settings from your code.

Read More

Machine Learning in the cloud

GC Machine Learning Platform

ML is a branch of the field of artificial intelligence. It’s a way of solving a problem without explicitly coding the solution. Instead, humans build systems that improve themselves over time through repeated exposure to training data (sample data). This is now available as a cloud service to add innovative capabilities to your apps.

It provides modern machine learning services with pre-trained models and a platform to generate your own tailored models.

TensorFlow

TensorFlow is an open-source library well suited to machine learning workloads such as neural networks. It needs lots of resources and training data. It can take advantage of Tensor Processing Units (TPUs), hardware devices designed to accelerate machine learning workloads. GCP makes them available to Compute Engine VMs.

Read More

Big Data in the cloud

GC Big Data Platform

All these services together are called the integrated serverless platform. Serverless means you don’t have to worry about provisioning compute instances to run your jobs. The services are fully managed; you pay only for the resources you consume.

The platform is integrated so that GCP services work together to create custom solutions.

Cloud Dataproc

Cloud Dataproc is a managed service for Apache Hadoop, an open-source framework for big data. It is great when you have a data set of known size, or when you want to manage your cluster size yourself.

Read More

Developing in the cloud

Cloud Source Repositories

Cloud Source Repositories is a way to keep code private to a GCP project and use IAM permissions to protect it, without having to maintain the Git instance yourself. It provides Git version control for services, including those that run on App Engine, Compute Engine and Kubernetes Engine.

You can have any number of private Git repositories. It contains a source viewer.

Cloud Functions

Cloud Functions avoid the need to integrate several event-handling components into your application: instead, you write a single-purpose function that reacts automatically to events.

With this, you don’t have to worry about servers or runtime binaries. You just write your code for the environment that GCP provides and configure when it should fire.

Read More

Applications in the cloud

App Engine

Two GCP products provide compute infrastructure for applications:

  • Compute Engine (based on VMs)
  • Google Kubernetes Engine (based on containers)

What they have in common is that you choose the infrastructure in which your app runs. If you don’t want to focus on infrastructure at all and want to focus only on your code, that’s what App Engine is for.

App Engine is a PaaS: it manages your hardware and network infrastructure. To deploy an app, you just give App Engine your code and the service takes care of the rest. It provides built-in services that many web apps need, such as NoSQL databases, in-memory caching, load balancing, health checks, logging and authentication.

Read More

Containers in the cloud (Kubernetes, GKE & Anthos)

Post with Docker definitions and usage

Google Engines

Compute Engine

Compute Engine is GCP’s IaaS (Infrastructure as a Service) offering.

It lets you run VMs in the cloud, gives you persistent storage and networking, and lets you share compute resources by virtualizing the hardware. Each VM has its own instance of an operating system, and you can build and run apps on it. A disadvantage is that the smallest unit is a whole VM together with its application, which makes it harder to scale as it grows.

App Engine

App Engine is GCP’s PaaS (Platform as a Service) offering.

Instead of a blank VM, you get access to a family of services that apps need. All you do is write your code as self-contained workloads that use these services and include any dependent libraries. As demand increases, your app scales seamlessly. This scales rapidly, but you give up control of the underlying server architecture.

Read More

Storage in the cloud

There are different storage and database solutions apart from your VMs’ persistent disks.

Types of data storage

Cloud Storage

Large storage for BLOBs. It stores data as a bunch of bytes and gives you a unique key to address each object; these keys are often in the form of URLs. It is fully scalable.

Its objects are immutable. It encrypts data server-side at rest, and encrypts data in transit via HTTPS.

Files are organized into buckets, which must have globally unique names. The environment variable $DEVSHELL_PROJECT_ID, which contains the project ID, is handy for that. A bucket may be set to a region or multi-region.

Buckets support object versioning. If you don’t turn it on, new versions always overwrite old ones. Cloud Storage offers lifecycle management policies so you don’t accumulate junk. There is a limit of 5 TB per object.

Read More

Virtual machines in the cloud

VPC (Virtual Private Cloud) Network

A VPC interconnects your GCP resources with each other and with the internet. To get started, define a VPC in your first GCP project or choose the default VPC.

It allows for

  • segment networks
  • use firewall rules to restrict instance access
  • create static routes to forward traffic to specific destinations

The VPC Networks that you define have global scope. The subnets have regional scope.

Read More

Getting started with GCP

GCP Resource Hierarchy

  • Organization Node (optional)
  • Folders (which may contain more folders)
  • Projects
  • Resources

All resources are stored in projects, and projects may be organized into folders. All folders and projects may be gathered together under an organization node.

Read More

Introducing Google Cloud Platform (GCP)

Cloud Computing

On-demand service.
No human intervention.
Provides shared resources to customers.
Offers rapid elasticity.
Pay only for what you consume.

Software waves

  • The first was physical
  • The second virtualized
  • The third serverless

GCP computing architectures

GCP lets you choose from computing, storage, big data, machine learning and application services. It’s global, cost-effective, open-source friendly and designed for security.

The following are opposite models

Read More