Big Data Analytics

What is Big Data Analytics?

Big data analytics describes the process of uncovering trends, patterns, and correlations in large amounts of raw data to help make data-informed decisions.

These processes use familiar statistical analysis techniques—like clustering and regression—and apply them to more extensive datasets with the help of newer tools.

How big data analytics works

1. Collect Data

Data collection looks different for every organization. With today’s technology, organizations can gather both structured and unstructured data from a variety of sources — from cloud storage to mobile applications to in-store IoT sensors and beyond. Some data will be stored in data warehouses where business intelligence tools and solutions can access it easily. Raw or unstructured data that is too diverse or complex for a warehouse may be assigned metadata and stored in a data lake.

2. Process Data

Once data is collected and stored, it must be organized properly to get accurate results on analytical queries, especially when it’s large and unstructured. Available data is growing exponentially, making data processing a challenge for organizations. One processing option is batch processing, which looks at large data blocks over time. Batch processing is useful when there is a longer turnaround time between collecting and analyzing data. Stream processing looks at small batches of data at once, shortening the delay time between collection and analysis for quicker decision-making. Stream processing is more complex and often more expensive.
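The difference between the two processing styles can be sketched in a few lines. This is only an illustration: the function names, the window size, and the sensor readings are made up, and real systems would use a dedicated batch or streaming framework.

```python
from typing import Iterable, Iterator, List


def batch_average(readings: List[float]) -> float:
    """Batch processing: wait for the whole block of data, then analyze it once."""
    return sum(readings) / len(readings)


def stream_averages(readings: Iterable[float], window: int = 3) -> Iterator[float]:
    """Stream processing: emit a result for each small window as soon as it fills."""
    buffer: List[float] = []
    for value in readings:
        buffer.append(value)
        if len(buffer) == window:
            yield sum(buffer) / window
            buffer.clear()
    if buffer:  # flush a final, partially filled window
        yield sum(buffer) / len(buffer)


# Made-up sample data standing in for sensor readings.
sensor_readings = [10.0, 20.0, 30.0, 40.0, 50.0, 60.0, 70.0]

batch_result = batch_average(sensor_readings)
stream_results = list(stream_averages(sensor_readings, window=3))

print(batch_result)    # one answer, available only after all data has arrived
print(stream_results)  # several answers, each available shortly after its window
```

The batch function gives a single result at the end, while the streaming generator produces intermediate results along the way, which is what shortens the delay between collection and analysis.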

3. Clean Data

Data, big or small, requires scrubbing to improve data quality and get stronger results; all data must be formatted correctly, and any duplicate or irrelevant data must be eliminated or accounted for. Dirty data can obscure and mislead, creating flawed insights.
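A minimal sketch of such a cleaning step is shown below. The record layout (a "name" and an "email" field) and the rule that a record without an email is irrelevant are assumptions made for the example, not part of any particular tool.

```python
from typing import Dict, List, Optional


def clean_records(records: List[Dict[str, Optional[str]]]) -> List[Dict[str, str]]:
    """Format fields consistently, drop incomplete rows, and remove duplicates."""
    cleaned: List[Dict[str, str]] = []
    seen_emails = set()
    for record in records:
        # Formatting: trim whitespace and normalize letter case.
        email = (record.get("email") or "").strip().lower()
        name = (record.get("name") or "").strip().title()
        if not email:
            continue  # assumed rule: a record without an email is not usable
        if email in seen_emails:
            continue  # duplicate of a record we already kept
        seen_emails.add(email)
        cleaned.append({"name": name, "email": email})
    return cleaned


# Made-up dirty input.
raw = [
    {"name": "  asha rao ", "email": "Asha@Example.com"},
    {"name": "Asha Rao", "email": "asha@example.com "},
    {"name": "Ravi", "email": None},
    {"name": "meera  ", "email": "meera@example.com"},
]

cleaned = clean_records(raw)
for row in cleaned:
    print(row)
```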

4. Analyze Data

Getting big data into a usable state takes time. Once it’s ready, advanced analytics processes can turn big data into big insights. Some of these big data analysis methods include:

Data mining sorts through large datasets to identify patterns and relationships by identifying anomalies and creating data clusters.

Predictive analytics uses an organization’s historical data to make predictions about the future, identifying upcoming risks and opportunities.

Deep learning imitates human learning patterns by using artificial intelligence and machine learning to layer algorithms and find patterns in the most complex and abstract data.
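As a small illustration of predictive analytics, the sketch below fits a straight line to past values with ordinary least-squares regression and extends it one step into the future. The monthly sales figures are invented for the example, and real predictive models are usually far richer than a single straight line.

```python
from typing import Sequence, Tuple


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up historical data: sales for months 1 to 5.
months = [1, 2, 3, 4, 5]
sales = [100.0, 120.0, 140.0, 160.0, 180.0]

slope, intercept = fit_line(months, sales)
forecast_month_6 = slope * 6 + intercept
print(slope, intercept, forecast_month_6)
```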

Big data is a term used to refer to data sets that are too large or complex for traditional data-processing software to handle. Processing such data sets requires specialized application software. Big data was originally associated with three key concepts: volume, variety, and velocity.

Characteristics

Big data can be described by the following characteristics:


Volume defines the quantity of generated and stored data. The size of the data determines its value and whether it can be considered big data at all.


Variety defines the type and nature of the data. Knowing this helps users make effective use of the data. Big data is a combination of text, images, audio, and video.


Velocity defines the speed at which data is generated and processed to meet demand. Big data is often available in real time. Compared to small data, big data is produced more continually. Two kinds of velocity apply to big data: the frequency of generation and the frequency of handling, recording, and publishing.

Big Data Types

Mainly, there are three types of Big Data, as given below:

Structured Data:- Structured data can be stored in tabular form, in rows and columns. Relational database tables are an example of structured data.

Unstructured Data:- Unstructured data has no predefined format and cannot be stored neatly in tabular form. Examples of unstructured data are audio, video, etc.

Semi-structured Data:- Semi-structured data has elements of both structured and unstructured data: it does not fit a rigid table, but it carries tags or keys that organize it. Examples of semi-structured data are XML data, JSON files, and others.
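The three types can be compared with a short sketch. All of the sample data below is invented for illustration; the point is only how each type is read.

```python
import csv
import io
import json

# Structured: every row has the same fixed columns, like a database table.
csv_text = "id,name,city\n1,Asha,Prayagraj\n2,Ravi,Delhi\n"
rows = list(csv.DictReader(io.StringIO(csv_text)))

# Semi-structured: keys label the data, but nesting and fields can vary per record.
json_text = (
    '{"id": 3, "name": "Meera", "tags": ["student", "python"], '
    '"address": {"city": "Lucknow"}}'
)
doc = json.loads(json_text)

# Unstructured: free text with no fields at all; only generic operations apply.
note = "Meera called about the Python course and asked for a callback."
word_count = len(note.split())

print(rows[0]["city"])         # looked up by column name
print(doc["address"]["city"])  # looked up by following nested keys
print(word_count)              # no schema, so we can only count or search
```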

Qus. What is Computer Ethics?

Ans. Computer ethics deals with the procedures, values and practices that govern the process of consuming computing technology and its related disciplines without damaging or violating the moral values and beliefs of any individual, organization or entity.

Qus. What is Software Piracy? How does shareware deal with software piracy?

Ans. Software piracy is the act of illegally using, copying, or distributing software without ownership or legal rights. Shareware is a good way to market software, and it counters piracy by offering a legitimate way to try software before buying it. Consumers can evaluate an application prior to making a purchase decision and easily determine whether it meets their business or personal needs, which usually results in a satisfied customer.

Qus. Why do you think it is difficult to stop software piracy in developing countries like India?

Ans. Software piracy is hard to stop for several reasons:

a. It is not a violent crime, so the only realistic response is a legal one: charging offenders with a crime and threatening them with prison if they do not quickly agree to a guilty plea.

b. Software piracy is very easy. If you want your software to work without an internet connection, you need to put the entire code and all the data into the hands of the user; consequently, any anti-piracy measures are in their hands as well, making it possible for smart coders to reverse-engineer your code, find the part which performs verification of the “license to use”, and remove or twist that part so that the software works without a license.

c. Sharing cracked software is ludicrously easy and hard to detect.

d. There are millions of people involved in this. Quite literally, not only your prisons, but your court-rooms as well are not big enough to realistically charge everyone. So you have to prioritize who you will go after.

e. Experts still cannot agree on the extent to which piracy is harmful. There have been numerous cases where a pirated version of something was instrumental in making it popular, which not only later led people to buy a legal copy, but also generated interest in a sequel, which then became a massive hit.

f. People who can afford buying movies and games generally do so; people who cannot afford it will either pirate them, or not get them at all.

g. The price tag associated with legal software is generally high.

Qus. What are the different ways of stopping Software Piracy?

Ans. The different ways to stop software piracy are:

• Educate your staff on the licensing requirements of your software purchases

• Conduct a self-audit of your software licenses

• Acquire any licenses needed for full compliance

• Use a license key, the most widely used method: code built into an application that requires a valid key to unlock the software.
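The license-key idea in the last point can be sketched as follows. This is a simplified illustration, not how any particular product works: the secret value, the key length, and the function names are all hypothetical. As noted in an earlier answer, a check that ships inside the software can be reverse-engineered, so real schemes often verify keys on a server as well.

```python
import hashlib
import hmac

# Hypothetical vendor secret; in practice it should never ship in plain form.
SECRET = b"demo-vendor-secret"


def make_license_key(user_email: str) -> str:
    """Derive a 16-character key from the buyer's email using HMAC-SHA256."""
    digest = hmac.new(SECRET, user_email.lower().encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16].upper()


def is_valid_license(user_email: str, key: str) -> bool:
    """Unlock the software only if the key matches the one derived for this email."""
    expected = make_license_key(user_email)
    # compare_digest avoids leaking information through comparison timing.
    return hmac.compare_digest(expected, key.strip().upper())


issued_key = make_license_key("user@example.com")
print(is_valid_license("user@example.com", issued_key))   # key belongs to this user
print(is_valid_license("other@example.com", issued_key))  # key copied by someone else
```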

Qus. What is the economic impact of spamming?

Ans. Spamming remains economically viable because advertisers have very little or sometimes no operating costs beyond the management of their mailing lists, and it is almost impossible to hold senders accountable for their mass mailings. The costs fall instead on recipients and Internet service providers, who lose time, bandwidth, and storage, and recipients may sometimes even be duped.

Qus. Discuss two main areas of Industrial Property.

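Ans. Patents, which protect inventions, and trademarks, which protect distinctive signs. Copyright is a separate branch of intellectual property and is not part of industrial property.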

Qus. How can spamming be reduced?

Ans. The following precautions greatly reduce the spam you receive and prevent most of it from reaching you in the first place.

• Be careful where you enter your email address.

• Create or use disposable email addresses for websites you do not trust.

• Never open spam when you receive it.

• Keep your computer virus and malware free.

• If your friends are sending you emails sent to a large recipient list, request that they use BCC instead of TO or CC, so that other recipients cannot see your email address; or request they stop including you if you do not want to receive the emails.

• Do not list your email address on your website or anywhere the public can access it.

Qus. How are phishing and pharming used to commit cybercrime?

Ans. Phishing is the fraudulent practice of sending emails purporting to be from reputable companies in order to induce individuals to reveal personal information, such as passwords and credit card numbers.

Pharming is the fraudulent practice of directing Internet users to a bogus website that mimics the appearance of a legitimate one, in order to obtain personal information such as passwords, account numbers, etc.
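One simple defence against look-alike links is to check the host name of a URL against the domains you actually trust. The sketch below assumes a hypothetical bank domain and a hypothetical helper name. It only inspects the link text, so it would not catch pharming that redirects the correct name to a bogus server.

```python
from urllib.parse import urlparse

# Hypothetical list of domains the user really does business with.
TRUSTED_DOMAINS = {"example-bank.com"}


def is_trusted_link(url: str) -> bool:
    """Return True only if the URL's host is a trusted domain or a subdomain of one."""
    host = (urlparse(url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in TRUSTED_DOMAINS)


print(is_trusted_link("https://www.example-bank.com/login"))        # genuine subdomain
print(is_trusted_link("https://example-bank.com.evil.test/login"))  # trusted name used as a prefix
print(is_trusted_link("http://examp1e-bank.com/login"))             # digit 1 instead of letter l
```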

Qus. What are the different types of Cybercrimes?

Ans. The different types of cybercrime are:

• Financial fraud crimes

• Cyberterrorism

• Cyberextortion

• Cyberwarfare

• Computer as a target

• Computer as a tool

Qus. How are Hackers different from Crackers?

Ans. Hackers are computer experts who break into computers to check for vulnerabilities so that no one can misuse the services. They are intelligent and skilled people who use their ability to protect the community from cybercrime and computer theft.

Crackers, on the other hand, are people who use their knowledge to commit computer crimes, either to gain popularity or to earn fast money. They break into computer networks for their own enjoyment and cause harm to them. Many crackers do not have deep knowledge; they only know how to use particular software to break into computers.

Qus. What is cloud computing?

Ans: Cloud computing is the delivery of computing services (servers, storage, databases, networking, software, analytics, intelligence, and more) over the Internet ('the cloud') to offer faster innovation, flexible resources, and economies of scale. You typically pay only for the cloud services you use, which helps lower your operating costs, run your infrastructure more efficiently, and scale as your business needs change.

Cloud computing is a big shift from the traditional way businesses think about IT resources.

Qus. What is a virus? What is anti-virus software?

Ans. A computer virus is a malicious program that self-replicates by copying itself into another program. In other words, the computer virus spreads by itself into other executable code or documents. The purpose of creating a computer virus is to infect vulnerable systems, gain admin control, and steal sensitive user data. Attackers design computer viruses with malicious intent and prey on online users by tricking them.

Antivirus software is a program or set of programs that are designed to prevent, search for, detect, and remove software viruses, and other malicious software like worms, trojans, adware, and more.

Qus. How is a backup utility useful? Is it necessary to take a backup of data?

Ans: Backup is a very helpful utility. It lets you keep a copy of your data, so that if the original is corrupted by a virus or Trojan, the copy remains safe and can be restored.

It is not strictly necessary to back up your data unless you have something important on your machine, so whether to back up is your choice. But if you have anything you will need in the future, it is better to take a backup.
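What a backup utility does can be sketched in a few lines: copy the file somewhere else and confirm that the copy matches the original. The file names and the checksum verification step are assumptions for the example, and the demo runs in a temporary folder rather than on real data.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return a checksum that changes if even one byte of the file changes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def backup_file(source: Path, backup_dir: Path) -> Path:
    """Copy source into backup_dir and verify that the copy is identical."""
    backup_dir.mkdir(parents=True, exist_ok=True)
    target = backup_dir / source.name
    shutil.copy2(source, target)  # copies contents and file metadata
    if sha256_of(source) != sha256_of(target):
        raise IOError("backup copy does not match the original")
    return target


with tempfile.TemporaryDirectory() as tmp:
    original = Path(tmp) / "notes.txt"
    original.write_text("important data")

    copy = backup_file(original, Path(tmp) / "backup")

    # Simulate the original being corrupted after the backup was taken.
    original.write_text("corrupted!")
    restored_text = copy.read_text()

print(restored_text)  # the backup still holds the original contents
```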

Qus. What are different types of threats to computer security?

Ans: A threat is a potential violation of security. When a threat is actually executed, it becomes an attack. Those who execute such actions, or cause them to be executed, are called attackers.

Some common threats the average computer user faces every day are:

1. Viruses

2. Worms

3. Trojans

4. Spyware

5. Adware

6. Spamming

7. PC Intrusion

8. Denial of Service

9. Sweeping

10. Password Guessing

11. Phishing

Qus. What types of damage can viruses cause to your computer?

Ans: Damage caused by viruses:

– Damage or delete files.

– Slow down your computer.

– Invade your email programs.

Qus. What is malware? What types of damage can it cause to your computer?

Ans: "Malware" is short for malicious software and is used as a single term to refer to viruses, spyware, worms, etc. Malware is designed to cause damage to a stand-alone computer or a networked PC. So wherever the term malware is used, it means a program designed to damage your computer; it may be a virus, worm, or Trojan.

Qus. What is spam? Why has it become a big Internet issue?

Ans: Spam email is a form of commercial advertising which is economically viable because email is a very cost-effective medium for the sender. If just a fraction of the recipients of a spam message purchase the advertised product, the spammers are making money and the spam problem is perpetuated.
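The arithmetic behind this can be sketched with a short calculation. Every number below is made up purely to show the shape of the calculation; none of them is a measured figure.

```python
def spam_profit(
    messages_sent: int,
    cost_per_message: float,
    response_rate: float,
    revenue_per_sale: float,
) -> float:
    """Return the sender's profit: revenue from the few buyers minus sending costs."""
    cost = messages_sent * cost_per_message
    revenue = messages_sent * response_rate * revenue_per_sale
    return revenue - cost


# Hypothetical inputs: a million emails, a tiny cost each, and 1 buyer per 10,000.
profit = spam_profit(
    messages_sent=1_000_000,
    cost_per_message=0.00001,
    response_rate=0.0001,
    revenue_per_sale=20.0,
)
print(round(profit, 2))
```

With these invented inputs the sender still comes out ahead even though almost nobody responds, which is why the problem persists.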

Qus. What are denial-of-service and sweeper attacks?

Ans: A denial-of-service attack is a security event that occurs when an attacker prevents legitimate users from accessing specific computer systems, devices, services or other IT resources.

Qus. What are Authentication and Authorization? Why are these two used together?

Ans: Authentication and authorization are often used in conjunction with each other in security, especially when it comes to gaining access to a system. Authentication means confirming the identity of a user, while authorization means deciding what that user is allowed to access. They are used together because a system must first know who you are before it can decide what you may do.
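The two steps can be sketched as two separate checks that run one after the other. The user name, password, and permission names below are hypothetical, and the plain SHA-256 hash is used only to keep the example short; real systems use salted, deliberately slow password hashing.

```python
import hashlib
import hmac

# Hypothetical user store: username -> hash of the password (demo only).
USERS = {"asha": hashlib.sha256(b"s3cret").hexdigest()}

# Hypothetical permission table: username -> actions that user may perform.
PERMISSIONS = {"asha": {"read_reports"}}


def authenticate(username: str, password: str) -> bool:
    """Authentication: is this person really who they claim to be?"""
    stored = USERS.get(username)
    if stored is None:
        return False
    supplied = hashlib.sha256(password.encode("utf-8")).hexdigest()
    return hmac.compare_digest(stored, supplied)


def authorize(username: str, action: str) -> bool:
    """Authorization: is this (already identified) user allowed to do this?"""
    return action in PERMISSIONS.get(username, set())


def access(username: str, password: str, action: str) -> str:
    """Run both checks in order: identity first, permission second."""
    if not authenticate(username, password):
        return "denied: identity not confirmed"
    if not authorize(username, action):
        return "denied: not permitted"
    return "granted"


print(access("asha", "s3cret", "read_reports"))    # passes both checks
print(access("asha", "s3cret", "delete_reports"))  # known user, action not allowed
print(access("asha", "wrong", "read_reports"))     # fails the identity check
```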

