Welcome!

Artificial Intelligence Authors: Pat Romanski, Elizabeth White, Liz McMillan, Yeshim Deniz, Valeriia Timokhina

Related Topics: @DevOpsSummit, Containers Expo Blog, @BigDataExpo

@DevOpsSummit: Article

Compression: Making the Big Smaller and Faster (Part 1) | @DevOpsSummit #DevOps #WebPerf

The sharing of information in a fast and efficient manner has been an area of constant study and research

Compression: Making the Big Smaller and Faster (Part 1)
By Nilabh Mishra

How important is data compression? The sharing of information in a fast and efficient manner has been an area of constant study and research. Companies like Google and Facebook have spent a lot of time and effort trying to develop faster and better compression algorithms. Compression algorithms have existed since the ’70s and the ongoing research to have better algorithms proves just how important compression is for the Internet and for all of us.

The Need for Data Compression
The World Wide Web (WWW) has undergone a lot of changes since it was made available to the public in 1991. Believe it or not, the copy of the world’s first website can still be browsed here. Back then, webpages were very simple. Today, they are increasingly more complex and there is an evident need to have compression algorithms that are lossless, fast, and efficient.

There are several best practices that help optimize page load times. Here is a blog from that discusses webpage optimization. In this article, we will spend some time understanding the basics of compression and how it works. We will also cover a new type of compression method called “Brotli” in the second part of this blog.

Encoding and Data Compression
Let’s start by understanding what data encoding and compression are:

The word “compression” comes from the Latin word compressare, which means to press together. “Encoding” is the process of placing a sequence of characters in a specialized format that allows efficient data storage as well as transmission. Per Wikipedia: “Data compression involves encoding information using fewer bits than the original representation.

Compression plays a key role when it comes to saving bandwidth and speeding up your site. Modern day websites involve a lot of HTTP requests and responses between the client (the browser) and the server to serve a webpage. With an overall increase in the number of HTTP requests and responses, it becomes important to ensure that these transfers are taking place at a fast and efficient rate.

HTTP works on a request-response model, as demonstrated below:

In this case, we are not using any compression method to compress the response being sent by the server.

  • The browser sends an HTTP request asking for the Index.html page
  • The server looks for the requested file and responds with the requested resource and a 200 OK HTTP status message
  • The browser receives the server’s response and renders the page

As we can see, in this case there is no compression involved. The server responded with a 300 KB file (index.html page). If the file size was bigger, it would have taken more time for the response to be sent on the wire and this would have increased the overall page load time. Please note that we are currently looking only at a single HTTP response. Modern websites receive hundreds of such HTTP responses from the server to render a webpage.

The image below shows the same HTTP request – response between the browser and the server, but in this case, we use compression to reduce the size of the response being sent by the server to the browser.

Today, complex and dynamic websites generate hundreds of HTTP requests/responses. This made it important to have a system which would ensure fast and efficient data transfer between the server and the browser. This is when compression algorithms like Deflate and Gzip came into existence.

Introduction to Gzip
Gzip is a compression method that is used to make files smaller for storage and faster transmission over the network. Gzip is one of the most popular, powerful, and effective ways of compressing data and it can reduce the file size by up to 70%.

Gzip is based on the DEFLATE algorithm, which in turn is a combination of LZ77 and Huffman coding. Understanding how LZ77 works is essential to understand how compression methods like DEFLATE and Gzip work.

LZ77
Developed in the late ’70s by Abraham Lempel and Jacob Ziv, the LZ77 method of compression looks for sequences of characters that recur in a text. It performs compression by replacing the recurring occurrences of strings using pointers that backreference identical strings, previously encountered in the text, that needs to be compressed.

The pointer or backreference is of the form <relative jump, length>, where relative jump signifies how many bytes are there between the current occurrence of the string and its last occurrence and length is the total number of identical bytes found.

Now let us understand this better with the help of an example. Assume, there is a text file with the following text:

As idle as a painted ship, upon a painted ocean.

In this file, we see the following strings: “as” and “painted” occurring multiple times. What LZ77 method does is, it replaces multiple occurrences of strings with the notation: <relative jump, length>.

So using LZ77, the text will get encoded in the following way:

As idle <8,2> a painted ship, upon a <21,7> ocean.

To encode the text, we took the following steps:

  1. Looked at the string and tried to find occurrences of the same “string” or “substrings”.
  2. Replaced multiple occurrences of a string with the notation: <relative jump, length>; The two strings: “as” and “painted” were replaced the multiple occurrences of the strings with <relative jump, length>.
  3. The string “painted” which would have earlier occupied 7 bytes (i.e. the number of characters in the word: “painted”) X 1 byte = 7 bytes was compressed to occupy only 2 bytes. 2 bytes or 16 bits is the size of the pointer or backreference.

HUFFMAN Coding
Huffman Coding is another lossless data compression algorithm. The frequency of occurrence of a string in a text file or pixels in images form the basis of Huffman coding. To get a deeper understanding of this algorithm, read this detailed tutorial that clearly explains how Huffman Coding works.

All modern browsers support Gzip compression for HTTP Requests. With Gzip, one of the most important question is what to compress. It works best with text-based resources like static HTML, CSS files and JavaScript resources but is not very efficient for already compressed resources such as Images. To support Gzip, the server must be configured to allow gzip compression.

The image above shows the impact Gzip compression can have on a text-based resource like a JavaScript file. In this case, we ran 2 instant tests using Catchpoint to the URL: https://code.jquery.com/jquery-3.2.1.js.

For the first test run, we did not specify any encoding to be used by passing the custom header: Accept-Encoding: identity along with the request. The first image shows no Content-Encoding being passed for the request.

In the second image, the browser is sending Accept-Encoding:zip, for which the server is sending zipped file as the response.

We can clearly see how Gzip can drastically compress the files to improve data transmission rate over the wire.

Catchpoint’s Scheduled tests also highlight the difference between compressed and not-compressed content loading on webpages.

In the screenshot above, we see the difference in downloaded bytes for static content (CSS, JavaScript) when using G-zip vs. when not using any encoding.

Brotli Compression
A new compression method called Brotli was introduced not too long ago. The Brotli compression algorithm is optimized for the web and specifically for small text documents. We will discuss more about this compression method and what is has to offer to the World Wide Web community in the second part of the article.

The post Compression: Making the Big Smaller and Faster (Part 1) appeared first on Catchpoint's Blog - Web Performance Monitoring.

More Stories By Mehdi Daoudi

Catchpoint radically transforms the way businesses manage, monitor, and test the performance of online applications. Truly understand and improve user experience with clear visibility into complex, distributed online systems.

Founded in 2008 by four DoubleClick / Google executives with a passion for speed, reliability and overall better online experiences, Catchpoint has now become the most innovative provider of web performance testing and monitoring solutions. We are a team with expertise in designing, building, operating, scaling and monitoring highly transactional Internet services used by thousands of companies and impacting the experience of millions of users. Catchpoint is funded by top-tier venture capital firm, Battery Ventures, which has invested in category leaders such as Akamai, Omniture (Adobe Systems), Optimizely, Tealium, BazaarVoice, Marketo and many more.

@ThingsExpo Stories
There is huge complexity in implementing a successful digital business that requires efficient on-premise and cloud back-end infrastructure, IT and Internet of Things (IoT) data, analytics, Machine Learning, Artificial Intelligence (AI) and Digital Applications. In the data center alone, there are physical and virtual infrastructures, multiple operating systems, multiple applications and new and emerging business and technological paradigms such as cloud computing and XaaS. And then there are pe...
SYS-CON Events announced today that MIRAI Inc. will exhibit at the Japan External Trade Organization (JETRO) Pavilion at SYS-CON's 21st International Cloud Expo®, which will take place on Oct 31 – Nov 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA. MIRAI Inc. are IT consultants from the public sector whose mission is to solve social issues by technology and innovation and to create a meaningful future for people.
SYS-CON Events announced today that Keisoku Research Consultant Co. will exhibit at the Japan External Trade Organization (JETRO) Pavilion at SYS-CON's 21st International Cloud Expo®, which will take place on Oct 31 – Nov 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA. Keisoku Research Consultant, Co. offers research and consulting in a wide range of civil engineering-related fields from information construction to preservation of cultural properties. For more information, vi...
SYS-CON Events announced today that Massive Networks, that helps your business operate seamlessly with fast, reliable, and secure internet and network solutions, has been named "Exhibitor" of SYS-CON's 21st International Cloud Expo ®, which will take place on Oct 31 - Nov 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA. As a premier telecommunications provider, Massive Networks is headquartered out of Louisville, Colorado. With years of experience under their belt, their team of...
SYS-CON Events announced today that Enroute Lab will exhibit at the Japan External Trade Organization (JETRO) Pavilion at SYS-CON's 21st International Cloud Expo®, which will take place on Oct 31 – Nov 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA. Enroute Lab is an industrial design, research and development company of unmanned robotic vehicle system. For more information, please visit http://elab.co.jp/.
SYS-CON Events announced today that Ryobi Systems will exhibit at the Japan External Trade Organization (JETRO) Pavilion at SYS-CON's 21st International Cloud Expo®, which will take place on Oct 31 – Nov 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA. Ryobi Systems Co., Ltd., as an information service company, specialized in business support for local governments and medical industry. We are challenging to achive the precision farming with AI. For more information, visit http:...
What is the best strategy for selecting the right offshore company for your business? In his session at 21st Cloud Expo, Alan Winters, U.S. Head of Business Development at MobiDev, will discuss the things to look for - positive and negative - in evaluating your options. He will also discuss how to maximize productivity with your offshore developers. Before you start your search, clearly understand your business needs and how that impacts software choices.
Real IoT production deployments running at scale are collecting sensor data from hundreds / thousands / millions of devices. The goal is to take business-critical actions on the real-time data and find insights from stored datasets. In his session at @ThingsExpo, John Walicki, Watson IoT Developer Advocate at IBM Cloud, will provide a fast-paced developer journey that follows the IoT sensor data from generation, to edge gateway, to edge analytics, to encryption, to the IBM Bluemix cloud, to Wa...
SYS-CON Events announced today that Fusic will exhibit at the Japan External Trade Organization (JETRO) Pavilion at SYS-CON's 21st International Cloud Expo®, which will take place on Oct 31 – Nov 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA. Fusic Co. provides mocks as virtual IoT devices. You can customize mocks, and get any amount of data at any time in your test. For more information, visit https://fusic.co.jp/english/.
SYS-CON Events announced today that B2Cloud will exhibit at SYS-CON's 21st International Cloud Expo®, which will take place on Oct 31 – Nov 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA. B2Cloud specializes in IoT devices for preventive and predictive maintenance in any kind of equipment retrieving data like Energy consumption, working time, temperature, humidity, pressure, etc.
SYS-CON Events announced today that NetApp has been named “Bronze Sponsor” of SYS-CON's 21st International Cloud Expo®, which will take place on Oct 31 – Nov 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA. NetApp is the data authority for hybrid cloud. NetApp provides a full range of hybrid cloud data services that simplify management of applications and data across cloud and on-premises environments to accelerate digital transformation. Together with their partners, NetApp em...
Elon Musk is among the notable industry figures who worries about the power of AI to destroy rather than help society. Mark Zuckerberg, on the other hand, embraces all that is going on. AI is most powerful when deployed across the vast networks being built for Internets of Things in the manufacturing, transportation and logistics, retail, healthcare, government and other sectors. Is AI transforming IoT for the good or the bad? Do we need to worry about its potential destructive power? Or will we...
SYS-CON Events announced today that SIGMA Corporation will exhibit at the Japan External Trade Organization (JETRO) Pavilion at SYS-CON's 21st International Cloud Expo®, which will take place on Oct 31 – Nov 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA. uLaser flow inspection device from the Japanese top share to Global Standard! Then, make the best use of data to flip to next page. For more information, visit http://www.sigma-k.co.jp/en/.
Agile has finally jumped the technology shark, expanding outside the software world. Enterprises are now increasingly adopting Agile practices across their organizations in order to successfully navigate the disruptive waters that threaten to drown them. In our quest for establishing change as a core competency in our organizations, this business-centric notion of Agile is an essential component of Agile Digital Transformation. In the years since the publication of the Agile Manifesto, the conn...
SYS-CON Events announced today that Nihon Micron will exhibit at the Japan External Trade Organization (JETRO) Pavilion at SYS-CON's 21st International Cloud Expo®, which will take place on Oct 31 – Nov 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA. Nihon Micron Co., Ltd. strives for technological innovation to establish high-density, high-precision processing technology for providing printed circuit board and metal mount RFID tags used for communication devices. For more inf...
SYS-CON Events announced today that Suzuki Inc. will exhibit at the Japan External Trade Organization (JETRO) Pavilion at SYS-CON's 21st International Cloud Expo®, which will take place on Oct 31 – Nov 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA. Suzuki Inc. is a semiconductor-related business, including sales of consuming parts, parts repair, and maintenance for semiconductor manufacturing machines, etc. It is also a health care business providing experimental research for...
SYS-CON Events announced today that Daiya Industry will exhibit at the Japan External Trade Organization (JETRO) Pavilion at SYS-CON's 21st International Cloud Expo®, which will take place on Oct 31 – Nov 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA. Daiya Industry specializes in orthotic support systems and assistive devices with pneumatic artificial muscles in order to contribute to an extended healthy life expectancy. For more information, please visit https://www.daiyak...
The Internet giants are fully embracing AI. All the services they offer to their customers are aimed at drawing a map of the world with the data they get. The AIs from these companies are used to build disruptive approaches that cannot be used by established enterprises, which are threatened by these disruptions. However, most leaders underestimate the effect this will have on their businesses. In his session at 21st Cloud Expo, Rene Buest, Director Market Research & Technology Evangelism at Ara...
SYS-CON Events announced today that N3N will exhibit at SYS-CON's @ThingsExpo, which will take place on Oct 31 – Nov 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA. N3N’s solutions increase the effectiveness of operations and control centers, increase the value of IoT investments, and facilitate real-time operational decision making. N3N enables operations teams with a four dimensional digital “big board” that consolidates real-time live video feeds alongside IoT sensor data a...
Internet of @ThingsExpo, taking place October 31 - November 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA, is co-located with 21st Cloud Expo and will feature technical sessions from a rock star conference faculty and the leading industry players in the world. The Internet of Things (IoT) is the most profound change in personal and enterprise IT since the creation of the Worldwide Web more than 20 years ago. All major researchers estimate there will be tens of billions devic...