Sujal Bhat committed on
Commit
323d65b
·
1 Parent(s): 3ae573f

colored deliverables

Files changed (1)
  1. deliverables/Task1.md +6 -3
deliverables/Task1.md CHANGED
@@ -16,18 +16,21 @@ Hint: Create a list of potential questions that people are likely to ask!
 ✅ Deliverables:
 
 1. Describe the default chunking strategy that you will use.
-
+<div style="color: green;">
 The default chunking strategy used is a combination of size-based splitting and thematic categorization.
 This strategy uses RecursiveCharacterTextSplitter with a chunk size of 1000 characters and an overlap of 200 characters. It then categorizes these chunks based on predefined themes.
+</div>
 
 2. Articulate a chunking strategy that you would also like to test out.
 
+<div style="color: green;">
 A pure size-based chunking strategy without thematic categorization. This would involve splitting the text into fixed-size chunks without attempting to categorize them based on themes.
+</div>
 
 
 
 3. Describe how and why you made these decisions
-
+<div style="color: green;">
 The default strategy was chosen for its simplicity and efficiency:
 
 * Size-based splitting (1000 characters) ensures manageable chunk sizes for processing and embedding.
@@ -41,7 +44,7 @@ The alternative pure size-based strategy:
 * Is simpler to implement and doesn't rely on predefined themes.
 * May split semantic units, potentially affecting the coherence of individual chunks.'
 * Could be more comprehensive, including all parts of the document regardless of theme.
-
+</div>
 
 
 
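The two strategies described in Task1.md can be compared with a small sketch. This is only an illustration under stated assumptions, not the code from this repository: `split_fixed` is a plain character-window splitter standing in for RecursiveCharacterTextSplitter (which additionally prefers to break on separators), and the `THEMES` table with its keyword matching is hypothetical, since the deliverable does not say how chunks are assigned to themes.

```python
from typing import Dict, List

# Hypothetical themes and keywords, for illustration only; the deliverable
# does not list the actual predefined themes.
THEMES: Dict[str, List[str]] = {
    "privacy": ["privacy", "personal data"],
    "safety": ["safety", "risk"],
}


def split_fixed(text: str, chunk_size: int = 1000, overlap: int = 200) -> List[str]:
    """Pure size-based strategy: fixed-size character windows with overlap."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap  # each window starts this far after the previous one
    chunks: List[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks


def categorize(chunk: str, themes: Dict[str, List[str]] = THEMES) -> List[str]:
    """Return the names of all themes whose keywords occur in the chunk."""
    lowered = chunk.lower()
    return [name for name, words in themes.items()
            if any(word in lowered for word in words)]


def chunk_with_themes(text: str) -> List[dict]:
    """Default strategy (assumed shape): size-based split, then theme tagging."""
    return [{"text": c, "themes": categorize(c)} for c in split_fixed(text)]
```

Calling `split_fixed(text)` alone corresponds to the alternative strategy, and `chunk_with_themes(text)` to the default one. Chunks that match no theme come back with an empty `themes` list; whether such chunks are kept or dropped is the design choice the third deliverable weighs.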