Finding someone who can help me with my data engineering project.

Objective

The main objective of this section is to design, implement, and query a relational database system for automotive component production. The database must manage machines, parts, work orders, production runs, process settings (including the setting parameters and output parameters from the provided Excel file), and any other data needed for manufacturing management. Learners will implement the physical database in PostgreSQL and use SQL/Python for data querying and simple data integration.

Data dictionary (excludes the process parameters explained in the previous section)

Column               Description / Notes
work_order_id        Work order identifier. One work order can cover multiple parts and multiple production runs.
production_run_id    Unique production-run identifier (often derived from work_order_id + machine_id + run sequence). Primary key candidate at run level.
Start                Start timestamp of the production run window.
End                  End timestamp of the production run window; should be ≥ Start.
shift_code           Shift label (e.g., A/B/C) for the run.
operator_id          Operator identifier responsible for the run. In a normalised schema, references an Operator master.
machine_id           Stable machine identifier. In a normalised schema, foreign key to Machine.
machine_name         Human-readable machine label; functionally depends on machine_id.
Machine details      Free-text machine label from the source file; retained for traceability (may duplicate machine_name).
part_id              Stable part identifier. In a normalised schema, foreign key to Part.
part_name            Human-readable part label; functionally depends on part_id.
Product details      Free-text product/part label from the source file; retained for traceability (may duplicate part_name).
planned_qty          Planned quantity for this run (units).
good_qty             Non-defective quantity produced in the run (units). Should satisfy 0 ≤ good_qty ≤ planned_qty.
defect_qty           Defective quantity produced in the run (units). Typically planned_qty = good_qty + defect_qty.
defect_type          Defect category for the run: one of Nil, Burrs, Surface Scratch, Drill Oversize, Others. Use Nil when defect_qty = 0.
defect*              Binary flag indicating presence of defects; derived from defect_qty (> 0 → 1, else 0).
power_kwh            Estimated power consumption (kWh) over the run window; usually derived from duration and load proxies.
unit_cost_sgd        Unit manufacturing cost placeholder for exercises.
run_cost_sgd         Run-level cost; typically planned_qty * unit_cost_sgd (derived).

Basic assumptions
- One machine can produce multiple parts over time.
- One work order can cover multiple parts.

Requirements
- Database design: Propose a relational schema for the production management system. Include any additional reasonable assumptions.
- Database implementation: Implement the data schema in PostgreSQL.
- Data insertion: Populate the database with the provided data set.
- Data query: Write SQL scripts to query and extract specific data from the database, performing the following queries:
  a. List all work orders handled by a specific machine.
  b. Calculate the total power consumption of a specific machine.
  c. Find the machine with the highest number of defects.
- Python integration: Write Python code to connect to the database and perform basic functions.

Instructions - Project Steps

Conceptual design (ERD)
- Analyze the provided dataset to identify the entities, attributes, relationships, and functional dependencies.
- Draw an Entity-Relationship Diagram (ERD) to represent the identified entities and their relationships.
- State your assumptions.
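Before the logical and physical design steps below, it may help to see where the data dictionary leads. The following is a minimal sketch of one possible 3NF layout written as PostgreSQL DDL; every table name, column type, and constraint here is an assumption for illustration, not the required design.

```sql
-- Sketch only: one possible 3NF decomposition of the flat dataset.
-- Names, types, and constraints are assumptions, not the expected answer.

CREATE TABLE machine (
    machine_id      TEXT PRIMARY KEY,
    machine_name    TEXT NOT NULL,
    machine_details TEXT                -- free-text label kept for traceability
);

CREATE TABLE part (
    part_id         TEXT PRIMARY KEY,
    part_name       TEXT NOT NULL,
    product_details TEXT
);

CREATE TABLE operator (
    operator_id TEXT PRIMARY KEY
);

CREATE TABLE work_order (
    work_order_id TEXT PRIMARY KEY
);

CREATE TABLE production_run (
    production_run_id TEXT PRIMARY KEY,
    work_order_id     TEXT NOT NULL REFERENCES work_order (work_order_id),
    machine_id        TEXT NOT NULL REFERENCES machine (machine_id),
    part_id           TEXT NOT NULL REFERENCES part (part_id),
    operator_id       TEXT REFERENCES operator (operator_id),
    start_ts          TIMESTAMP NOT NULL,
    end_ts            TIMESTAMP NOT NULL,
    shift_code        TEXT,
    planned_qty       INTEGER NOT NULL,
    good_qty          INTEGER NOT NULL,
    defect_qty        INTEGER NOT NULL,
    defect_type       TEXT,
    power_kwh         NUMERIC(10, 2),
    unit_cost_sgd     NUMERIC(10, 2),
    run_cost_sgd      NUMERIC(12, 2),   -- derived: planned_qty * unit_cost_sgd
    CONSTRAINT run_window_valid CHECK (end_ts >= start_ts),
    CONSTRAINT qty_bounds       CHECK (good_qty BETWEEN 0 AND planned_qty AND defect_qty >= 0),
    CONSTRAINT defect_type_valid CHECK (defect_type IN
        ('Nil', 'Burrs', 'Surface Scratch', 'Drill Oversize', 'Others'))
);
```

In this sketch the free-text "Machine details" and "Product details" columns move into the master tables so the run table stays in 3NF, and derived values such as the defect flag and run cost could be computed in views or generated columns rather than stored.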
Logical design & normalization Identify data constraints & transform the ERD into relational schemas. Normalize the dataset to at least 3rd Normal Form (3NF). Explain how redundancy/inconsistency is removed. Physical design in PosgreSQL Implement the data schema in PostgreSQL. Ensure relationships are implemented correctly using foreign key constraints. Data insertion Insert the provided data into the tables based on the normalized dataset using either way: Use SQL INSERT statements for manual entry. Use pgAdmin to perform a bulk data import by uploading .csv files. Note: If you choose this approach, make sure to include the .csv files as part of your submission. Data querying using SQL Provide the SQL scripts for (a), (b), (c). Include query results (screenshots ). a. List all work orders handled by a specific machine. b. Calculate the total power consumption by a specific machine c. Find the machine that has most number of defects Python integration with PostgreSQL: Provide a short Python script that connects to PostgreSQL inserts a machine into the database changes the quantity of work order Appendix PDF format ERD and relational schemas with assumptions. Normalisation steps (how you reached 3NF). SQL scripts and clear screenshots of tables populated in PostgreSQL. SQL queries and outputs Python code with evidence of successful DB interaction. Any data-cleaning notes and mapping decisions for process-setting columns. [If gAI was used] Link to gAI conversation/s showing all the prompts and corresponding outputs from the generative AI system. Source files sql (schema + insert/import + queries) .ipynb (Python) Any CSVs used for import