
Mohammed Mohammed Ali

MohammedEltoum

AI & ML interests

None yet

Recent Activity

Organizations

ARIC

MohammedEltoum's activity

reacted to Jaward's post with ❤️ 2 days ago
reacted to as-cle-bert's post with ❤️ 6 days ago
One of the biggest challenges I've faced since I started developing [PdfItDown](https://github.com/AstraBert/PdfItDown) has been correctly handling the conversion of files like Excel sheets and CSVs: table conversion was bad and messy, almost unusable for downstream tasks 🫣

That's why today I'm excited to introduce readers, the new feature of PdfItDown v1.4.0! 🎉

With readers, you can choose among three (for now 👀) flavors of text extraction and conversion to PDF:

- Docling, which does a fantastic job with presentations, spreadsheets and Word documents 🦆

- LlamaParse by LlamaIndex, suitable for more complex and articulated documents, with a mixture of text, images and tables 🦙

- MarkItDown by Microsoft, not the best at handling highly structured documents, but extremely flexible in terms of input file format (it can even convert XML, JSON and ZIP files!) ✒️

You can use this new feature in your Python scripts (check the attached code snippet! 😉) and in the command-line interface as well! 🐍
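
The trade-offs between the three readers can be sketched as a small lookup by file extension. This is only an illustration of the guidance in the post, not PdfItDown's API: the function `pick_reader`, the table `READER_BY_EXTENSION`, the lowercase reader labels, and the choice of extensions (including sending PDFs to LlamaParse) are all assumptions made for this example.

```python
from pathlib import Path

# Hypothetical mapping, derived only from the strengths described in the post.
# It is not part of PdfItDown and the reader labels are illustrative strings.
READER_BY_EXTENSION = {
    ".pptx": "docling",     # presentations
    ".xlsx": "docling",     # spreadsheets
    ".docx": "docling",     # Word documents
    ".pdf": "llamaparse",   # assumed: complex documents mixing text, images, tables
    ".xml": "markitdown",   # flexible input formats
    ".json": "markitdown",
    ".zip": "markitdown",
}


def pick_reader(file_path: str, default: str = "markitdown") -> str:
    """Return the reader label suggested for this file's extension."""
    suffix = Path(file_path).suffix.lower()
    # Fall back to the most format-flexible reader for unknown extensions.
    return READER_BY_EXTENSION.get(suffix, default)
```

For instance, `pick_reader("budget.xlsx")` returns `"docling"`, while an unlisted extension falls back to the default label.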

Have fun and don't forget to star the repo on GitHub ➡️ https://github.com/AstraBert/PdfItDown
reacted to hesamation's post with ❤️ 28 days ago
Google published a 69-page whitepaper on Prompt Engineering and its best practices, a must-read if you are using LLMs in production:
> zero-shot, one-shot, few-shot
> system prompting
> chain-of-thought (CoT)
> ReAct
> code prompting
> best practices

LINK: https://www.kaggle.com/whitepaper-prompt-engineering
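
To make the zero-shot / one-shot / few-shot distinction concrete, here is a minimal sketch of a prompt builder. It is not taken from the whitepaper: the function name `build_few_shot_prompt` and the `Input:` / `Output:` layout are assumptions chosen for illustration. Passing no examples gives a zero-shot prompt, one example a one-shot prompt, and several a few-shot prompt.

```python
def build_few_shot_prompt(instruction, examples, query):
    """Assemble a plain-text prompt from an instruction, worked examples, and a query.

    `examples` is a list of (input_text, expected_output) pairs; an empty list
    produces a zero-shot prompt.
    """
    lines = [instruction, ""]
    for text, label in examples:
        lines.append(f"Input: {text}")
        lines.append(f"Output: {label}")
        lines.append("")
    # The final block is left open so the model completes the output.
    lines.append(f"Input: {query}")
    lines.append("Output:")
    return "\n".join(lines)


prompt = build_few_shot_prompt(
    "Classify the sentiment of each review as POSITIVE or NEGATIVE.",
    [("Loved it, would buy again.", "POSITIVE"),
     ("Broke after two days.", "NEGATIVE")],
    "Does exactly what it says.",
)
```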
reacted to prithivMLmods's post with 👍 about 1 month ago
Dropping downstream-task models with newly initialized classifier parameters ([classifier.bias & weights]) for domain-specific image classification. They are based on siglip2-base-patch16-224 and DomainNet (single-domain, multi-source adaptation), with Fashion-MNIST and more for experimental testing. 🧤☄️

Fashion-Mnist : prithivMLmods/Fashion-Mnist-SigLIP2
Mnist-Digits : prithivMLmods/Mnist-Digits-SigLIP2
Multisource-121 : prithivMLmods/Multisource-121-DomainNet
Painting-126 : prithivMLmods/Painting-126-DomainNet
Sketch-126 : prithivMLmods/Sketch-126-DomainNet
Clipart-126 : prithivMLmods/Clipart-126-DomainNet

Models are trained with different parameter settings for experimental purposes only, with the intent of further development. Refer to the model page below for instructions on running it with Transformers 🤗.
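
As a rough sketch of what running one of these checkpoints might look like, the snippet below uses the generic Transformers `pipeline("image-classification", ...)` entry point. This is an assumption, not the instructions from the model pages: the helper name `classify` and the `MODEL_IDS` table are hypothetical, the repository IDs are copied from the list above, and the checkpoints may need a recent Transformers release with SigLIP 2 support plus PyTorch and Pillow installed.

```python
# Repository IDs copied from the post; the dictionary itself is illustrative.
MODEL_IDS = {
    "Fashion-Mnist": "prithivMLmods/Fashion-Mnist-SigLIP2",
    "Mnist-Digits": "prithivMLmods/Mnist-Digits-SigLIP2",
    "Multisource-121": "prithivMLmods/Multisource-121-DomainNet",
    "Painting-126": "prithivMLmods/Painting-126-DomainNet",
    "Sketch-126": "prithivMLmods/Sketch-126-DomainNet",
    "Clipart-126": "prithivMLmods/Clipart-126-DomainNet",
}


def classify(image_path: str, dataset: str = "Fashion-Mnist", top_k: int = 3):
    """Return the top_k predicted labels for an image (hypothetical helper).

    Assumes the checkpoint loads through the standard image-classification
    pipeline; the model page remains the authoritative reference.
    """
    # Imported lazily so MODEL_IDS can be inspected without transformers installed.
    from transformers import pipeline

    classifier = pipeline("image-classification", model=MODEL_IDS[dataset])
    return classifier(image_path, top_k=top_k)
```

Calling `classify("shoe.png")` would download the checkpoint on first use and return a list of label and score dictionaries; `shoe.png` is a placeholder path.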

Collection : prithivMLmods/domainnet-0324-67e0e3c934c03cc40c6c8782

Citations : SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features https://arxiv.org/pdf/2502.14786 & Moment Matching for Multi-Source Domain Adaptation : https://arxiv.org/pdf/1812.01754

reacted to hanzla's post with 👍 about 1 month ago
Hi community,

A few days back, I posted about my ongoing research on building reasoning Mamba models, and I received great insights from the community.

Today, I am announcing an update to the model weights. With the newer checkpoints, the Falcon3 Mamba R1 model now outperforms very large transformer-based LLMs (including Gemini) on the Formal Logic questions of MMLU. It scores 60% on Formal Logic, which is considered a tough subset of MMLU.

I would highly appreciate your insights and suggestions on this new checkpoint.

Model Repo: hanzla/Falcon3-Mamba-R1-v0

Chat space: hanzla/Falcon3MambaReasoner
reacted to rizavelioglu's post with ❤️ 2 months ago
Comparing reconstruction quality of various VAEs with an interactive demo
rizavelioglu/vae-comparison
reacted to openfree's post with ❤️ 2 months ago
Datasets Convertor 🚀

openfree/Datasets-Convertor

Welcome to Datasets Convertor, the cutting-edge solution engineered for seamless and efficient data format conversion. Designed with both data professionals and enthusiasts in mind, our tool simplifies the transformation process between the CSV, Parquet, JSONL, and XLS file formats, ensuring that your data is always in the right shape for your next analytical or development challenge. 💻✨

Why Choose Datasets Convertor?
In today's data-driven world, managing and converting large datasets can be a daunting task. Our converter is built on top of robust technologies like Pandas and Gradio, delivering reliable performance with a modern, intuitive interface. Whether you're a data scientist, analyst, or developer, Datasets Convertor empowers you to effortlessly switch between formats while maintaining data integrity and optimizing storage.

Key Features and Capabilities:
CSV ⇆ Parquet Conversion:
Easily transform your CSV files into the highly efficient Parquet format and vice versa. Parquet's columnar storage not only reduces file size but also accelerates query performance, a critical advantage for big data analytics. 🔄📂

CSV to JSONL Conversion:
Convert CSV files to JSONL (newline-delimited JSON) to facilitate efficient, line-by-line data processing. This format is particularly useful for streaming data applications, logging systems, and scenarios where incremental data processing is required. Each CSV row is meticulously converted into an individual JSON record, preserving all the metadata and ensuring compatibility with modern data pipelines. 📄➡️📝
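
The row-to-record mapping described above can be sketched with Pandas as follows. The helper names `csv_text_to_jsonl` and `read_jsonl` are assumptions for this example and are not taken from the Space; the sketch works on in-memory text to stay self-contained.

```python
import io
import json

import pandas as pd


def csv_text_to_jsonl(csv_text: str) -> str:
    """Convert CSV text to JSONL: one JSON object per CSV row."""
    frame = pd.read_csv(io.StringIO(csv_text))
    # orient="records" with lines=True emits one JSON record per line.
    return frame.to_json(orient="records", lines=True)


def read_jsonl(jsonl_text: str) -> list:
    """Parse JSONL text line by line, skipping blank lines."""
    return [json.loads(line) for line in jsonl_text.splitlines() if line.strip()]


records = read_jsonl(csv_text_to_jsonl("id,name\n1,alice\n2,bob\n"))
```

Because each line is a complete JSON document, a consumer can process records one at a time without loading the whole file.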

Parquet to JSONL Conversion:
For those working with Parquet files, our tool offers a streamlined conversion to JSONL.

Parquet to XLS Conversion.