Friday, 7 August 2020

Exploring Java 8 Features


Many companies still rely on Java 7. But do you know when Java 8 was released? More than six years ago! Yes, Java 8 was released on March 18, 2014, and many teams have still not adopted it. Let's explore one Java 8 feature every day in our upcoming posts!

Java 8 Features

  • Functional Interfaces and Lambda Expressions
  • Java Stream API for Bulk Data Operations on Collections
  • forEach() method in Iterable interface
  • default and static methods in Interfaces
  • Java Time API
  • Collection API improvements
  • Concurrency API improvements
  • Java IO improvements
  • Miscellaneous Core API improvements
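
As a quick taste before the detailed posts, here is a minimal sketch that touches several of the features listed above: a lambda, a functional interface with default and static methods, a stream pipeline, forEach(), and the Java Time API. The names Greeter, Java8Tour, evenSquares, and yearsSinceJava8 are made up for this illustration; only the JDK classes are real.

```java
import java.time.LocalDate;
import java.time.Period;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical interface: one abstract method makes it a functional interface.
interface Greeter {
    String name();

    // default method: an implementation that lives in the interface itself
    default String greet() {
        return "Hello, " + name() + "!";
    }

    // static interface method, returning a lambda that implements name()
    static Greeter of(String name) {
        return () -> name;
    }
}

class Java8Tour {
    // Stream API: keep the even numbers, square them, collect into a list.
    static List<Integer> evenSquares(List<Integer> numbers) {
        return numbers.stream()
                .filter(n -> n % 2 == 0)
                .map(n -> n * n)
                .collect(Collectors.toList());
    }

    // Java Time API: whole years between the Java 8 release date and a given day.
    static int yearsSinceJava8(LocalDate today) {
        LocalDate release = LocalDate.of(2014, 3, 18);
        return Period.between(release, today).getYears();
    }

    public static void main(String[] args) {
        System.out.println(Greeter.of("Java 8").greet());
        // Iterable.forEach with a method reference
        evenSquares(Arrays.asList(1, 2, 3, 4, 5, 6)).forEach(System.out::println);
        System.out.println(yearsSinceJava8(LocalDate.of(2020, 8, 7)));
    }
}
```

Each of these features gets its own post, so treat this only as a preview of the syntax.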

Tuesday, 9 June 2020

Self Sovereign Identity

Every day we see a lot of identity breaches. From the Facebook data leak to the Aadhaar information leak in India, people's identities have come under the scanner. We all deserve to own our identity and to control its usage without involving third parties. In the physical world, we each hold a unique set of documents, such as a passport, a driving license, or a voter ID, that prove we are citizens of a particular country, that we know how to drive, or that we are eligible to vote. We own these identity proofs and we can keep them safe, but in the digital world we are dealing with a different scenario.

Current digital identity has two major problems. First, we don't own the identity. We identify ourselves with a username and password, or by logging into different systems using SSO authentication from third-party organizations like Google and Facebook. Thus, our identity is not owned by us, and we don't control how it is used. The second big problem is oversharing of information. To vote, we only need to prove that we are 18 or older, but the voter ID reveals other unnecessary information such as date of birth and address. This problem affects both physical and digital identities.

Self Sovereign Identity came as a one-stop solution to these problems. It combines attributes from different credentials and presents them as a single proof. It relies on zero-knowledge proofs, so the proof reveals only Yes/No answers. If the question is "Are you eligible to vote?", the proof gives only Yes or No as the answer. The identity proof is presented in such a way that the verifier can check the authenticity of the credential (its issuer, its uniqueness, and its integrity) and can ensure that it has not been tampered with or revoked, all without contacting the issuer.

For example, if you want to vote, the issuer (the government) puts its public key in the ledger store and issues a unique token to you. When you reach the voting booth, the verifier can verify that it is you just by checking the data in the ledger store. This ledger store is not a centralized authority, and it is not run by any single organization. We call this ledger store the Sovereign Ledger; it is tamper-resistant and chronologically ordered.
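
The issue-then-verify flow can be sketched with plain JDK signatures. This is only an illustration under strong assumptions: the in-memory map stands in for the ledger, the claim string and the class LedgerSketch with its methods are hypothetical, and an ECDSA signature is used in place of a real zero-knowledge proof. It shows just one idea: the verifier needs only the issuer's public key from the ledger, never a call to the issuer.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PublicKey;
import java.security.Signature;
import java.util.HashMap;
import java.util.Map;

class LedgerSketch {
    // Stand-in for the public ledger: issuer id -> issuer public key.
    static final Map<String, PublicKey> LEDGER = new HashMap<>();

    // The issuer creates a key pair and publishes only the public key.
    static KeyPair registerIssuer(String issuerId) {
        try {
            KeyPairGenerator gen = KeyPairGenerator.getInstance("EC");
            gen.initialize(256);
            KeyPair keys = gen.generateKeyPair();
            LEDGER.put(issuerId, keys.getPublic());
            return keys;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // The issuer signs a minimal claim; the signature acts as the token.
    static byte[] issue(KeyPair issuerKeys, String claim) {
        try {
            Signature signer = Signature.getInstance("SHA256withECDSA");
            signer.initSign(issuerKeys.getPrivate());
            signer.update(claim.getBytes(StandardCharsets.UTF_8));
            return signer.sign();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // The verifier looks up the issuer's key on the ledger and checks the token.
    static boolean verify(String issuerId, String claim, byte[] token) {
        PublicKey key = LEDGER.get(issuerId);
        if (key == null) {
            return false; // unknown issuer
        }
        try {
            Signature verifier = Signature.getInstance("SHA256withECDSA");
            verifier.initVerify(key);
            verifier.update(claim.getBytes(StandardCharsets.UTF_8));
            return verifier.verify(token);
        } catch (GeneralSecurityException e) {
            return false; // a malformed token counts as invalid
        }
    }

    public static void main(String[] args) {
        KeyPair government = registerIssuer("government");
        byte[] token = issue(government, "eligibleToVote=true");
        System.out.println(verify("government", "eligibleToVote=true", token));
        System.out.println(verify("government", "eligibleToVote=false", token));
    }
}
```

Note that the signed claim carries only the Yes/No fact, not the date of birth or address. Real SSI stacks go much further than this, with revocation checks and proofs that do not expose a reusable token.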

The relationship between the voting booth and you is established only once, and it is unique. Similarly, if you go to the bank, you establish another unique relationship by proving possession of your credentials. The connection setup and the credential exchange happen off-ledger, privately, without involving third parties. Finally, after authorization, the bank provides you with a digital token. Once you have the credential and the relationship, there is no need for a username or password, and no login at all. Just by proving possession of the credentials and the connection you set up, you are saying, "It's me, and here is the digital proof."

By establishing a peer-to-peer connection, we are protected against man-in-the-middle attacks. To make this work, we need open protocols and standards. Several organizations around the world have come forward to maintain a standard ledger that abides by certain principles and rules, ensuring that control of identity stays with the people themselves. This is what separates the Sovereign Ledger from Bitcoin and Ethereum. This is where the Hyperledger community comes into the picture. Stay tuned for the next write-ups!

Tuesday, 2 June 2020

Exploring Hazelcast Jet Runner

Apache Beam is an open-source, unified model for defining both batch and streaming data-parallel processing pipelines. Using one of the open-source Beam SDKs, users can build a program that defines the pipeline. The Beam Pipeline Runners translate the data processing pipeline that the user defined in the Beam program into an API compatible with the distributed processing back-end of the user's choice.
A Beam Runner runs a Beam pipeline on a specific (often distributed) data processing system. Available runners are listed below:
·        DirectRunner: Runs locally on your machine.
·        ApexRunner: Runs on Apache Apex.
·        FlinkRunner: Runs on Apache Flink.
·        SparkRunner: Runs on Apache Spark.
·        DataflowRunner: Runs on Google Cloud Dataflow.
·        GearpumpRunner: Runs on Apache Gearpump (incubating).
·        SamzaRunner: Runs on Apache Samza.
·        NemoRunner: Runs on Apache Nemo.
·        JetRunner: Runs on Hazelcast Jet.

The vision of Beam is to support three groups: End Users who want to write pipelines in the language of their choice, SDK Writers who wish to unleash the power of Beam through various new languages, and Runner Writers who have a distributed processing environment and want to support Beam pipelines.

The Hazelcast Jet Runner is one such runner; it executes Beam pipelines using Hazelcast Jet. It lets the user write modern Java code that focuses purely on data transformation, while Jet does all the heavy lifting of getting the data flowing and the computation running across a cluster of nodes. It supports both bounded (batch) and unbounded (streaming) data.

As part of my course work, I decided to specialize in the distributed systems track. After completing Distributed Systems, I enrolled in the Advanced Distributed Systems course as well, which gave me an interesting opportunity to develop a streaming query that analyses data from the Linear Road Benchmark. I deployed that query in a Flink cluster.

This was the initial spark that triggered my interest in data streaming, and I continued to explore Apache Flink, Apache Spark, Apache Samza, and their runner support for Apache Beam. While I was diving deep into Beam Pipeline Runners, the conference talks about the Apache Flink runner for Beam and the Samza portable runner for Beam gave me architectural insight into Beam portable runners. Recently I worked on distributed computing projects and gained some hands-on experience with basic data streaming modules. I have also developed a Blackboard to implement strict, loose, and eventual consistency models as part of my distributed systems course work.

I wanted to explore the field of distributed systems further. Finally, I found Hazelcast Jet, an interesting DAG-based distributed computing Java library for building fault-tolerant and elastic data processing pipelines that can distribute DAG tasks across cores and nodes to run in parallel. Another interesting feature of Jet is its use of application-level cooperative threads, which enable efficient parallelism without the overhead of context switching between OS-level threads. Thus, Jet aims to deliver high performance with no external planning required.
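
To make the idea of cooperative threads concrete, here is a toy sketch in plain Java. It is not Jet's actual API: the Tasklet interface, the CooperativeWorker class, and the counter helper are hypothetical names chosen for this illustration. Each task does one small step and returns, so a single OS thread can interleave many tasks without any OS-level context switch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical interface: a unit of work that performs one small step per call.
interface Tasklet {
    boolean step(); // returns true once the tasklet has finished
}

class CooperativeWorker {
    // Runs all tasklets on the calling thread, round-robin, until each is done.
    static void runAll(List<Tasklet> tasklets) {
        List<Tasklet> active = new ArrayList<>(tasklets);
        while (!active.isEmpty()) {
            Iterator<Tasklet> it = active.iterator();
            while (it.hasNext()) {
                if (it.next().step()) {
                    it.remove(); // finished tasklets leave the rotation
                }
            }
        }
    }

    // A tasklet that logs "name:1" .. "name:count", one entry per step (count >= 1).
    static Tasklet counter(String name, int count, List<String> log) {
        int[] done = {0};
        return () -> {
            done[0]++;
            log.add(name + ":" + done[0]);
            return done[0] >= count;
        };
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        runAll(Arrays.asList(counter("a", 2, log), counter("b", 3, log)));
        System.out.println(log); // prints [a:1, b:1, a:2, b:2, b:3]
    }
}
```

The interleaved output shows both tasks making progress on one thread. The catch with this style is that a step must never block, otherwise every other task sharing the thread stalls with it.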

