A Streamlit-based internal tool designed to validate, clean, and split Excel data based on complex business rules related to Castle & DUNS decision logic.
This app ensures data quality, rule compliance, and standardized outputs for downstream processing.
β
Secure Login (Session-Based)
π€ Excel Upload & Sheet Validation
π Advanced Business Rule Validation
π§Ή Column Cleanup & Standardization
π PH / FH Record Analysis
β¬οΈ CSV Export (PH, FH, or Combined)
- Python
- Streamlit
- Pandas
- Excel (.xlsx) Processing
- Uses Streamlit session state for authentication
- Prevents access unless logged in
- Styled with custom CSS for centered UI
π Authentication logic can be extended to integrate real user validation
- Accepts only
.xlsxfiles - Blocks workflow until a valid file is uploaded
- Targets a specific required sheet:
- Displays:
- Row & column count
- Data preview (first few rows)
β Stops execution if the sheet is missing
Ensures these columns exist and contain no missing values:
DS_CastleSiteDUNS_DecisionDS_Research_Decisionprtl_hrchy_flg
- Allowed
DS_Research_Decision: Accept Castle MatchProposed different Site DUNSprtl_hrchy_flgmust be N- Additional mandatory DUNS fields if Proposed different Site DUNS
prtl_hrchy_flgmust be Y- Invalid if
DS_Research_Decisionis: Accept Castle MatchProposed different Site DUNS
Special handling for:
- No Site/Domestic/GU DUNS found
- Proposed GU / DU / Parent DUNS scenarios
All proposed DUNS fields must:
- Be numeric
- Be exactly 9 digits
Validated using reusable helper logic.
- Any rule violation:
- Shows error message
- Displays affected rows
- Stops execution immediately
βοΈ If all checks pass β success message displayed
Removes unnecessary metadata columns such as:
- Match grades
- Confidence codes
- DNB hierarchy fields
βοΈ Ensures final dataset contains exactly 29 columns
Displays:
- Cleaned dataset preview
- Row & column counts
- Summary counts:
- PH records (
prtl_hrchy_flg = Y) - FH records (
prtl_hrchy_flg = N)
Users can download:
- π PH data only
- π FH data only
- π Combined PH + FH data
βοΈ Custom file naming
βοΈ UTF-8 encoded CSV output
- Enforces strict business rules
- Prevents bad data from moving downstream
- Reduces manual validation effort
- Built for operational efficiency & audit readiness
- Data Quality Validation
- Castle vs Research Decision Reconciliation
- DUNS Hierarchy Processing
- Internal Ops & Analytics Teams
- Role-based authentication
- Logging & audit trail
- Config-driven rule management
- Deployment via Streamlit Cloud / Docker
KJ
Business / Data Analyst
Python β’ SQL β’ Power BI β’ Streamlit
β If this project helps you, consider giving it a star!