Data represents a variety of useful information that often needs to be stored, sorted, categorized and analyzed to inform decision-making. Data is organized in data structures which represent the data as entities with attributes or characteristics.
Data can be classified as structured, semi-structured or unstructured.
Structured data
Structured data has a fixed schema: every record shares the same fields, and each field has a single data type. The schema is usually tabular, with a column for each field and a row for each entity. Structured data is often stored in databases with multiple tables that can reference each other through key values in a relational model.
ID | Name | Surname | Email |
1 | Naiomi | Naidoo | Naiomi.Naidoo@technology.online |
2 | Firstname | Lastname | Firstname@yahoo.com |
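As a rough illustration of a fixed schema and a key reference between tables, the sketch below uses Python's built-in sqlite3 module with an in-memory database. The `customer` table mirrors the example table above; the `customer_order` table, its columns and the "Laptop" row are hypothetical and exist only to show one table referencing another by key.

```python
import sqlite3

# In-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")

# Fixed schema: every customer row has the same four typed fields.
conn.execute(
    """
    CREATE TABLE customer (
        id      INTEGER PRIMARY KEY,
        name    TEXT NOT NULL,
        surname TEXT NOT NULL,
        email   TEXT NOT NULL
    )
    """
)

# Hypothetical second table (not in the original text) that references
# customer rows through the customer_id key.
conn.execute(
    """
    CREATE TABLE customer_order (
        id          INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customer(id),
        item        TEXT NOT NULL
    )
    """
)

conn.executemany(
    "INSERT INTO customer (id, name, surname, email) VALUES (?, ?, ?, ?)",
    [
        (1, "Naiomi", "Naidoo", "Naiomi.Naidoo@technology.online"),
        (2, "Firstname", "Lastname", "Firstname@yahoo.com"),
    ],
)

# Made-up order row, for illustration only.
conn.execute(
    "INSERT INTO customer_order (customer_id, item) VALUES (?, ?)",
    (1, "Laptop"),
)

# Join the two tables on the key value.
rows = conn.execute(
    """
    SELECT c.name, c.surname, o.item
    FROM customer AS c
    JOIN customer_order AS o ON o.customer_id = c.id
    """
).fetchall()

print(rows)
```

Because the schema is fixed, an insert that omits a required column such as `email` would be rejected by the database rather than stored as a row with a different shape.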
Semi-structured data
Semi-structured data has some structure, but the fields present can vary from one entity instance to the next.
Scenario: Some customers may have an email address while others may have multiple email addresses or no email address at all.
JavaScript Object Notation (JSON) is a common format for representing semi-structured data because of its flexible nature.
// Customer 1
{
  "id": "1",
  "name": "Naiomi",
  "surname": "Naidoo",
  "contact": {
    "email": "naiomi@naidoo.com",
    "phone": "+27121231234"
  }
}

// Customer 2
{
  "id": "2",
  "name": "Firstname",
  "surname": "Lastname",
  "contact": {
    "email": "firstname@yahoo.com",
    "phone": "+27987654321"
  },
  "location": {
    "city": "Sandton"
  }
}
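Code that consumes semi-structured data usually has to tolerate missing fields. The sketch below is one possible way to do that in Python with the standard json module. It assumes the two customer records above are wrapped in a single JSON array, and it drops the `//` labels because standard JSON does not allow comments. The `describe` function and its fallback strings are hypothetical helpers, not part of any library.

```python
import json

# The two customer records, wrapped in an array (an assumption made
# here so that they can be parsed as one JSON document).
customers_json = """
[
  {
    "id": "1", "name": "Naiomi", "surname": "Naidoo",
    "contact": {"email": "naiomi@naidoo.com", "phone": "+27121231234"}
  },
  {
    "id": "2", "name": "Firstname", "surname": "Lastname",
    "contact": {"email": "firstname@yahoo.com", "phone": "+27987654321"},
    "location": {"city": "Sandton"}
  }
]
"""


def describe(customer: dict) -> str:
    """Summarize a customer, tolerating absent optional fields."""
    # dict.get returns the fallback when a key is missing, so a record
    # without "contact" or "location" does not raise KeyError.
    email = customer.get("contact", {}).get("email", "no email")
    city = customer.get("location", {}).get("city", "unknown city")
    return f"{customer['name']} {customer['surname']}: {email}, {city}"


customers = json.loads(customers_json)
summaries = [describe(c) for c in customers]

for line in summaries:
    print(line)
```

Only the second record carries a `location`, so the first summary falls back to the placeholder city while the second reports Sandton.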
Unstructured data
Unstructured data has no predefined schema. Documents, images, audio, video and binary files are typical examples.
