Infraspeak’s Data Lake solution is a product that provides unparalleled access to Infraspeak’s data, enhancing customers’ scalability and flexibility, increasing team performance, improving operational efficiency, and enabling a centralized data stream.
The Infraspeak Data Lake has a simplified architecture that enables a new generation of data-related use cases. It lets you choose how to ingest your Infraspeak data, is compatible with any tech stack, and gives you access to your data backups.
Data Lake Structure:
In short, the Infraspeak Data Lake has the following structure:
For each Infraspeak platform entity, a directory is generated that contains folders with the .csv files resulting from the ETL process.
The files are named with a timestamp and the corresponding object.
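For illustration only, the layout can be pictured as follows (the bucket name and intermediate folder level are placeholders; the exact structure is shared during the Data Lake setup):

s3://<data-lake-bucket>/
    <entity>/                          e.g. failure/, element/, ...
        <folder>/
            <entity>_<timestamp>.csv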
Infraspeak customers can retrieve the S3 files via the AWS API, using the IAM credentials shared with them.
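As a minimal sketch of that retrieval, the Python snippet below uses boto3 to list and download the .csv files of one entity. The bucket name, prefix, and file key are hypothetical placeholders; use the bucket and IAM credentials shared with you during setup.

import boto3

# Placeholders: use the IAM credentials and bucket name shared by Infraspeak.
session = boto3.Session(
    aws_access_key_id="YOUR_ACCESS_KEY_ID",
    aws_secret_access_key="YOUR_SECRET_ACCESS_KEY",
)
s3 = session.client("s3")
bucket = "your-infraspeak-data-lake-bucket"  # hypothetical name

# List the .csv files generated for one entity, e.g. the "failure" directory.
paginator = s3.get_paginator("list_objects_v2")
for page in paginator.paginate(Bucket=bucket, Prefix="failure/"):
    for obj in page.get("Contents", []):
        print(obj["Key"], obj["Size"])

# Download a specific file locally (the key below is illustrative).
s3.download_file(bucket, "failure/2024/failure_20240101T000000.csv", "failure.csv")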
By using the Athena connector of any analytics tool, customers can select which tables (from the available data) they want to import.
Athena has a workgroup for each customer, which defines the specific S3 output folder where query results are stored.
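The Python sketch below, again using boto3, shows how this fits together: it lists the tables available through the connector and runs a query in the customer workgroup, whose configuration already determines the S3 output folder for the results. The region, database name, and workgroup name are assumptions; use the values provided by Infraspeak.

import time
import boto3

athena = boto3.client("athena", region_name="eu-west-1")  # region is an assumption

DATABASE = "your_infraspeak_database"  # placeholder
WORKGROUP = "your-customer-workgroup"  # placeholder; defines the S3 output folder

# List the tables available through the Athena connector.
tables = athena.list_table_metadata(CatalogName="AwsDataCatalog", DatabaseName=DATABASE)
for table in tables["TableMetadataList"]:
    print(table["Name"])

# Run a query in your workgroup; results are written to the workgroup's
# configured output folder in S3.
execution = athena.start_query_execution(
    QueryString='SELECT * FROM "failure" LIMIT 10',  # illustrative; confirm names in the Table Repository
    QueryExecutionContext={"Database": DATABASE},
    WorkGroup=WORKGROUP,
)
query_id = execution["QueryExecutionId"]

# Poll until the query finishes, then print the rows.
while True:
    state = athena.get_query_execution(QueryExecutionId=query_id)["QueryExecution"]["Status"]["State"]
    if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
        break
    time.sleep(1)

if state == "SUCCEEDED":
    results = athena.get_query_results(QueryExecutionId=query_id)
    for row in results["ResultSet"]["Rows"]:
        print([col.get("VarCharValue") for col in row["Data"]])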
Data Lake Tables:
In Infraspeak Data Lake, you will find data from the following objects:
Data Lake objects/tables (please note that only the data lake “__view” tables should be used for analytics; an example query is shown after the list below):
buy_order → Purchases (of type Material and Services)
buy_order_material → Material Purchase Lines
buy_order_registry → Buy order’s actions registry
buy_order_service → Service Purchase Lines
category → Maintenance Categories
category_meterings → Maintenance Categories meterings
category_meterings_catalog → Catalog options of category meterings
characteristic → Maintenance Categories characteristics
client → Clients
client_operator → Client’s associated users
cost_center → Cost Centers
element → Assets
element_characteristic → Asset characteristics
element_economic → Asset economic data
element_other_cost → Other Costs registry in Assets
element_registry → Asset registry
event → Work Order and Planned Job Orders events
event_registry → Event Registry
failure → Work Orders
failure_element → Work Order’s Assets
failure_other_cost → Other costs registry in Work Orders
failure_pause_reason → Work Order Pause reasons
failure_priority → Available Work Order priorities
failure_sla → Work Orders SLA
failure_sla_rule → Work Order SLA rules
failure_sla_rule_operator → Work Order SLA notification rules per Operator
failure_sla_rule_registry → Work Order SLA rules registry
gatekeeper → Gatekeepers available/configured
gatekeeper_answer_registry → Registry of answered gatekeepers
gatekeeper_question → Questions available in the Gatekeeper
gatekeeper_question_answer → Registry of questions answered in each gatekeeper
intervention → Maintenance Categories interventions
intervention_procedure → Intervention’s tasks
local_operator → Location’s associated users
location → Locations
location_building_info → Buildings
maintenance_procedure → Categories configured tasks
maintenance_procedure_metering → Tasks’ associated measurements
material → Materials
material_warehouse → Materials’ warehouse association
metering_registry → Metering registry
operator → Users
operator_activity → Operator activity registry
operator_technical_skill → Operator’s associated technical skills
other_cost → Other Costs
problem → Work Order areas and types
problem_technical_skill → Work Order area’s associated technical skills
problem_responsible → Work Order area’s associated responsibles
quote → Quotes
quote_line → Quote Lines
quote_request → Quote requests
quote_request_line → Quote request Lines
scheduled_work → Planned Job Orders
schedule_work_other_cost → Other Costs registry in Planned Job Orders
sell_order → Sell Orders
sell_order_line → Sell Order Lines
stock → Stock registry
stock_movement → Stock movement registry
supplier → Suppliers
technical_skill → Technical Skills
warehouse → Warehouses
work → Planned Jobs
work_intervention → Planned Jobs interventions
work_location → Planned Jobs’ locations
work_responsible → Planned Job responsibles
work_sla_rule → Planned Jobs SLA rules
work_sla_rule_operator → Planned Jobs SLA notification rules per Operator
work_type → Planned Job types
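As an illustration of how these objects relate, and of the note above that only the “__view” tables should be used for analytics, the snippet below defines a sample query joining Work Orders with their assets; it can be passed as the QueryString in the Athena sketch shown earlier. The view and column names are assumptions; confirm them in the Table Repository before running it.

# Illustrative analytics query; view and column names are assumptions to be
# confirmed in the Table Repository.
SAMPLE_QUERY = """
SELECT f.id          AS work_order_id,
       f.description AS work_order_description,
       e.name        AS asset_name
FROM "failure__view" AS f
JOIN "failure_element__view" AS fe ON fe.failure_id = f.id
JOIN "element__view" AS e ON e.id = fe.element_id
LIMIT 100
"""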
The data is provided sequentially, and the timing of its availability depends on the contracted plan.
Table Repository
The Data Lake repository offers crucial information about the available tables and their correlations. Access to the repository is granted automatically via an email invitation once the Data Lake is configured.
Navigating the Repository
On the main page, you'll find a comprehensive list of tables available in the Data Lake. We recommend reviewing this list. To view the details of a specific table, simply click directly on the table name in the list on the left side of the page. The documentation also includes table correlation diagrams.
The specific diagram for each table can be viewed directly on that table's page, as explained above. To access the general diagram showing all tables, click on "Diagram" at the top center of the repository's first page.