Kaushik066 committed
Commit b9bd11b · verified · 1 Parent(s): e965ca9

Update app.py

Files changed (1)
  1. app.py +46 -17
app.py CHANGED
@@ -242,26 +242,55 @@ about_tab, app_tab = st.tabs(["About the app", "Face Recognition"])
 with about_tab:
 st.markdown(
 """
- ## Product Description/Objective
- An AI face recognition app for automated employee attendance uses advanced facial recognition technology to accurately and efficiently track employee attendance.
- By simply scanning employees' faces upon arrival and departure, the app eliminates the need for traditional timecards or biometric devices, reducing errors and fraud.
- It provides real-time attendance data, enhances workplace security, and streamlines HR processes for greater productivity and accuracy.

- ## How does it work?
- Our app leverages Google's advanced **Vision Transformer (ViT)** architecture, trained on the **LFW (Labeled Faces in the Wild) dataset**, to deliver highly accurate employee attendance tracking through facial recognition.
- The AI model intelligently extracts distinct facial features and compares them to the stored data of registered employees. When an employee’s face is scanned, the model analyzes the key features and generates a confidence score.
- A high score indicates a match, confirming the employee’s identity and marking their attendance automatically. This seamless, secure process ensures precise tracking while minimizing errors and enhancing workplace efficiency.

- ### About the architecture
- The Vision Transformer (ViT) is a deep learning architecture designed for image classification tasks. It applies transformer models, originally developed for natural language processing (NLP), to images.
- ViT divides an image into fixed-size non-overlapping patches. Each patch is flattened into a 1D vector, which is then linearly embedded into a higher-dimensional space. The patch embeddings are processed by a standard transformer encoder.
- This consists of layers with multi-head self-attention and feed-forward networks. The transformer is capable of learning global dependencies across the entire image.
- The Vision Transformer outperforms traditional convolutional neural networks (CNNs) on large-scale datasets, especially when provided with sufficient training data and computational resources.

- ### About the dataset
- Labeled Faces in the Wild (LFW) is a well-known dataset used primarily for evaluating face recognition algorithms. It consists of facial images of famous individuals collected from the web.
- LFW contains 13,000+ labeled images of 5,749 different individuals. The images often show individuals in different lighting, poses, and backgrounds.
- LFW is typically used for face verification and face recognition tasks: the goal is to determine whether two images represent the same person.
 """)

 # Gesture recognition Tab
 
 with about_tab:
 st.markdown(
 """
+ # 👁️‍🗨️ AI-Powered Face Recognition Attendance System
+ Effortless, Secure, and Accurate Attendance with Vision Transformer Technology

+ An intelligent, facial recognition-based attendance solution that redefines how organizations manage employee presence. By leveraging cutting-edge computer vision and AI, the app automates attendance tracking with speed, precision, and reliability—no timecards, no fingerprint scans, just a glance.
+ ## 🎯 Project Objective
+ To replace outdated, manual attendance methods with a seamless, contactless facial recognition system. Our solution not only improves the accuracy of attendance logs but also boosts workplace security and streamlines HR operations, all in real time.
+ Employees are simply scanned as they enter or leave the premises. Their attendance is automatically logged, reducing the risk of buddy punching, manual entry errors, and delays in record-keeping.
 
 
+ ## 🧠 How It Works: The AI in Action
+ At the core of this app is Google’s Vision Transformer (ViT) architecture, trained on the Labeled Faces in the Wild (LFW) dataset for robust, real-world face recognition.
+
+ - **Face Detection & Feature Extraction**
+ The model scans an employee’s face and extracts a high-dimensional representation of their unique features.
+
+ - **Identity Matching with Confidence Scoring**
+ The scanned features are compared to stored profiles. If the confidence score crosses a threshold, the model confirms the match and automatically marks attendance.
+
+ - **Real-Time Logging**
+ The app logs entry and exit times in real time, providing live dashboards and attendance reports for HR and management.
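The matching step above can be sketched as a cosine-similarity comparison against stored embeddings. This is a minimal illustration, not the app's actual code: the employee names, the toy 4-dimensional vectors, and the 0.85 threshold are all assumptions for the sake of the example (real ViT face embeddings are far higher-dimensional).

```python
import numpy as np

def match_identity(scanned, profiles, threshold=0.85):
    """Compare a scanned face embedding to stored employee profiles.

    Returns (employee_name, confidence) when the best cosine similarity
    crosses the threshold, otherwise (None, confidence).
    """
    best_name, best_score = None, -1.0
    for name, stored in profiles.items():
        # Cosine similarity between the two embedding vectors.
        score = np.dot(scanned, stored) / (
            np.linalg.norm(scanned) * np.linalg.norm(stored)
        )
        if score > best_score:
            best_name, best_score = name, score
    if best_score >= threshold:
        return best_name, best_score
    return None, best_score

# Toy embeddings standing in for stored employee profiles.
profiles = {
    "alice": np.array([0.9, 0.1, 0.0, 0.4]),
    "bob":   np.array([0.1, 0.8, 0.5, 0.0]),
}
scan = np.array([0.88, 0.12, 0.02, 0.41])  # very close to "alice"
name, confidence = match_identity(scan, profiles)
```

A scan that matches no stored profile stays below the threshold, so the function returns `None` and attendance is not marked.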
+ ## 🏗️ About the Architecture: Vision Transformer (ViT)
+ The Vision Transformer (ViT) brings the power of transformer models—originally created for language—to the world of images. Here's how it works:
+
+ - An input image is split into fixed-size non-overlapping patches.
+ - Each patch is flattened and embedded into a higher-dimensional space.
+ - These embeddings are fed into a transformer encoder, which learns complex spatial and contextual relationships across the entire image using multi-head self-attention.
+ - ViT’s ability to capture global dependencies enables it to outperform traditional CNNs when trained on sufficient data.
+
+ This makes it ideal for high-accuracy face recognition in dynamic, real-world environments.
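The patch-splitting and embedding steps can be sketched with plain NumPy. The 224×224 input, 16×16 patches, and 768-dimensional embedding below follow common ViT-Base defaults, but the random projection matrix is only a stand-in for the learned patch-embedding weights:

```python
import numpy as np

rng = np.random.default_rng(0)

# A dummy 224x224 RGB image (a common ViT input size).
image = rng.random((224, 224, 3))

patch = 16             # patch side length
n = 224 // patch       # 14 patches per side -> 196 patches total
embed_dim = 768        # ViT-Base embedding dimension

# 1) Split the image into non-overlapping 16x16 patches, flattening
#    each into a 1D vector of length 16 * 16 * 3 = 768.
patches = (
    image.reshape(n, patch, n, patch, 3)
    .swapaxes(1, 2)
    .reshape(n * n, -1)
)

# 2) Linearly embed each flattened patch; a random matrix stands in
#    for the projection that ViT learns during training.
W = rng.random((patches.shape[1], embed_dim))
embeddings = patches @ W  # shape: (196, 768)
```

The resulting sequence of 196 patch embeddings is what the transformer encoder consumes, treating patches the way a language model treats tokens.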
+ ## 📚 About the Dataset: Labeled Faces in the Wild (LFW)
+ To train the model, we used the renowned Labeled Faces in the Wild (LFW) dataset:
+
+ - 13,000+ facial images of 5,749 individuals, shown in diverse lighting, angles, and backgrounds
+ - Sourced from real-world photographs of public figures
+ - A benchmark dataset for tasks like face verification and recognition
+
+ The diversity in LFW ensures our model is resilient to variations in appearance, making it highly reliable in real-world workplace scenarios.
+ ## ✅ Key Features
+ - Fast, contactless attendance logging
+ - High-security identity verification
+ - Real-time data and analytics
+ - Powered by state-of-the-art Vision Transformer architecture
+ - Eliminates manual records, reduces fraud, enhances efficiency
+ ## 👥 Use Cases
+ - Corporate Offices: Accurate time tracking and security for large workforces
+ - Factories & Warehouses: Contactless attendance in high-throughput environments
+ - Educational Institutions: Seamless student and staff attendance
+ - Healthcare & Public Services: Ensures hygienic, automated check-ins
+ ## 🚀 Future Scope
+ Looking ahead, we aim to integrate multi-face detection for group scanning, mask-aware recognition, and cross-location synchronization for distributed teams—all while preserving data privacy and security.
 """)

 # Gesture recognition Tab