Conversation


@Jasonzyt Jasonzyt commented Aug 24, 2025

πŸ”— Linked issue

Close #3511

❓ Type of change

  • πŸ“– Documentation (updates to the documentation or readme)
  • 🐞 Bug fix (a non-breaking change that fixes an issue)
  • πŸ‘Œ Enhancement (improving an existing functionality like performance)
  • ✨ New feature (a non-breaking change that adds functionality)
  • ⚠️ Breaking change (fix or feature that would cause existing functionality to change)

πŸ“š Description

Currently, CSV collections don’t behave as described in the documentation.
Instead of allowing each CSV file to contain multiple entries (as the documentation
states), the current implementation treats each CSV file as a single piece of content.

πŸ“ Checklist

  • I have linked an issue or discussion.
  • I have updated the documentation accordingly.


vercel bot commented Aug 24, 2025

@Jasonzyt is attempting to deploy a commit to the NuxtLabs Team on Vercel.

A member of the Team first needs to authorize it.


pkg-pr-new bot commented Aug 24, 2025

npm i https://pkg.pr.new/@nuxt/content@3513

commit: 364b2d8

@Jasonzyt Jasonzyt marked this pull request as ready for review August 24, 2025 13:45
@farnabaz
Member

Thanks for the PR @Jasonzyt
It's a shame that I missed these changes in the documentation update. The documentation is wrong.
But this behavior is not planned to be supported. Instead, the module could support it only for single-file CSV collections.

people: defineCollection({
  type: "data",
  source: "org/people.csv",
  schema: z.object({
    name: z.string(),
    email: z.string().email(),
  }),
}),

It is important to keep a one-to-one, predictable mapping between files in the content directory and documents in the database. Splitting CSV files outside of the collection source logic breaks this predictability.

@Jasonzyt
Author

Jasonzyt commented Sep 12, 2025

In my case, I only need single-file CSV support.
I agree with you, so the documentation should be updated.

My PR can implement single-file CSV support, but to be honest some of the logic needs to be optimized.
And I think CSV data-type support (currently only strings) should be added too.

@farnabaz
Member

I'll update your PR, then we can test and improve.

@farnabaz
Member

@Jasonzyt Could you check and test the behavior?
We need to update the docs to document both single-file and multi-file behaviors.

const { queries, hash } = generateCollectionInsert(collection, parsedContent)
list.push([key, queries, hash])
}
Author

@Jasonzyt Jasonzyt Sep 12, 2025


I'm not sure why you reverted this.
Since CSV files contain multiple rows, and each row should be treated as an independent ParsedContent, they require a special handling process. A CSV file should not be processed as a single ParsedContent.

In contrast, formats like Markdown/JSON/YML contain only one ParsedContent per file, so they must be handled differently.

The facts bear this out: after installing the latest package, errors occurred.

Author


CSV and other formats must be handled separately. Maybe we can find a better way to do this.

Member


We don't need to do this. Content now has defineCSVSource, which reads a CSV file and generates a document for each row.
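A rough sketch of what a single-file CSV collection could look like with this helper. This is a guess only: the export path and the include field below are assumptions inferred from the defineCSVSource(source: CollectionSource) signature in this PR, not confirmed public API.

```typescript
// Hypothetical sketch: defineCSVSource's export location and the
// `include` field are assumptions, not a confirmed public API.
import { defineContentConfig, defineCollection, defineCSVSource, z } from '@nuxt/content'

export default defineContentConfig({
  collections: {
    people: defineCollection({
      type: 'data',
      // Each row of people.csv would become one document in the collection.
      source: defineCSVSource({ include: 'org/people.csv' }),
      schema: z.object({
        name: z.string(),
        email: z.string().email(),
      }),
    }),
  },
})
```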

@Jasonzyt
Author

Jasonzyt commented Sep 12, 2025

The facts bear this out: after installing the latest package, errors occurred.

In my case

const { data } = await useAsyncData("albumsMeta", () => {
  return queryCollection("albumsMeta").order("updated", "DESC").all();
});

returns

[
  {
    id: 'albumsMeta/albums-meta/albums-meta.csv',
    cover: null,
    description: null,
    extension: 'csv',
    meta: { path: '/albums-meta/albums-meta', body: [Array], title: 'Albums Meta' },
    name: null,
    stem: 'albums-meta/albums-meta',
    updated: null,
    urlFormat: null,
    __hash__: 'qZyKG0MlhJb1M0gDHO6L8sGT9FxtlwB_nASlz1K0ooM'
  }
]

This is exactly the bug described in #3511.

@farnabaz
Member

@Jasonzyt Could you share your reproduction repository, or provide a minimal one?

@Jasonzyt
Author

Here's my repo: https://github.com/Jasonzyt/gallery

@farnabaz
Member

@Jasonzyt There was a mistake in a string; it should be good now. Try with https://pkg.pr.new/@nuxt/content@0f6d610

@tazim404

tazim404 commented Sep 12, 2025

I installed npm i https://pkg.pr.new/@nuxt/content@3513 and it worked fine. But after a few hours, when I installed this package in a different project, it gave the same error as before.

Edit:
After installing npm i https://pkg.pr.new/@nuxt/content@0f6d610, it's giving the same problem.

@tazim404

Please solve this problem urgently. I have projects to submit.

@Jasonzyt
Author

@tazim404
Try https://pkg.pr.new/@nuxt/content@c1cefd2?
c1cefd2 works for me, but the later changes broke it.

@Jasonzyt
Author

Jasonzyt commented Sep 13, 2025

@Jasonzyt There was a mistake in a string, should be good now. Try with https://pkg.pr.new/@nuxt/content@0f6d610

@farnabaz Still the same bug, behaving like #3511.

@farnabaz
Member

@Jasonzyt I checked with your project and it works as expected with single-file sources.
Screenshot 2025-09-15 at 11 59 05

try with npm i https://pkg.pr.new/@nuxt/content@f790439

Note: I had to remove styles and some other deps because I was facing an issue with oxc in your repo.
Note 2: It is not recommended to use useAsyncData outside of script setup; this utility is designed to be used directly in script setup.

@Jasonzyt
Author

Jasonzyt commented Oct 26, 2025

Okay, I found the problem:
You can't use a **.csv glob in content.config.ts

export default defineContentConfig({
  collections: {
    albumsMeta: defineCollection({
      type: "data",
-      source: "test/**.csv",
+      source: "test.csv",
      schema: z.object({
        id: z.string(),
        name: z.string(),
        description: z.string(),
        cover: z.string(),
        updated: z.string(),
        urlFormat: z.string(),
      }),
    }),
    ...defineAlbumCollections(),
  },
});

Now everything works well!

@farnabaz Thanks for the review! I think we should complete the docs before merging.

@Jasonzyt
Author

Jasonzyt commented Oct 26, 2025

@farnabaz It seems there's a bug parsing multiple .csv files.
A single .csv file works well.
The current behavior does not match the newly updated docs.

Jasonzyt added a commit to Jasonzyt/gallery that referenced this pull request Oct 26, 2025
@Jasonzyt Jasonzyt mentioned this pull request Nov 10, 2025
@Jasonzyt
Author

@farnabaz Okay, now I understand everything. Sorry for misunderstanding your newly updated docs.
To be honest, some of the code is really confusing; I spent hours reading it before I finally figured it out.
I just reorganized the docs to make them clearer. Now I think the PR is ready to merge!
Thanks for your work.

Comment on lines +134 to +148
return new Promise((resolve) => {
  const csvKeys: string[] = []
  let count = 0
  createReadStream(join(resolvedSource.cwd, fixed, keys[0]!))
    .on('data', function (chunk) {
      for (let i = 0; i < chunk.length; i += 1)
        if (chunk[i] == 10) {
          if (count > 0) { // count === 0 is CSV header row and should not be included
            csvKeys.push(`${keys[0]}#${count}`)
          }
          count += 1
        }
    })
    .on('end', () => resolve(csvKeys))
})


CSV files without trailing newlines will have their last data row missing from the collection. The getKeys function only generates row keys when it encounters newline characters, so the final row is skipped if the file doesn't end with a newline.

View Details
πŸ“ Patch Details
diff --git a/src/utils/source.ts b/src/utils/source.ts
index 801f27c0..2fd55da6 100644
--- a/src/utils/source.ts
+++ b/src/utils/source.ts
@@ -134,17 +134,26 @@ export function defineCSVSource(source: CollectionSource): ResolvedCollectionSou
       return new Promise((resolve) => {
         const csvKeys: string[] = []
         let count = 0
+        let lastByteWasNewline = true
         createReadStream(join(resolvedSource.cwd, fixed, keys[0]!))
           .on('data', function (chunk) {
-            for (let i = 0; i < chunk.length; i += 1)
+            for (let i = 0; i < chunk.length; i += 1) {
+              lastByteWasNewline = (chunk[i] == 10)
               if (chunk[i] == 10) {
                 if (count > 0) { // count === 0 is CSV header row and should not be included
                   csvKeys.push(`${keys[0]}#${count}`)
                 }
                 count += 1
               }
+            }
+          })
+          .on('end', () => {
+            // If file doesn't end with newline and we have at least one data row, add the last row
+            if (!lastByteWasNewline && count > 0) {
+              csvKeys.push(`${keys[0]}#${count}`)
+            }
+            resolve(csvKeys)
           })
-          .on('end', () => resolve(csvKeys))
       })
     },
     getItem: async (key) => {

Analysis

CSV files without trailing newlines have their last row missing from collection

What fails: In the defineCSVSource() function in src/utils/source.ts, the getKeys implementation only generates row keys when it encounters newline characters. This causes the final data row to be skipped if the CSV file doesn't end with a newline character.

How to reproduce:

# Create a CSV file without a trailing newline (printf, unlike plain echo -n, interprets \n)
printf 'name,email\nJohn,[email protected]\nJane,[email protected]' > test.csv

# Use defineCSVSource to read the keys
# Expected: ['test.csv#1', 'test.csv#2']
# Actual: ['test.csv#1'] (Jane's row is missing)

Result: The getKeys() function returns only 1 row key instead of 2. When the file has 2 data rows but no trailing newline, only the first row is accessible. The second row is silently dropped from the collection.

Expected behavior: Both rows should be included in the collection regardless of whether the file ends with a newline. CSV files without trailing newlines are common and are valid per RFC 4180 (CSV specification).

Root cause: The algorithm increments a counter each time it encounters a newline (byte value 10), but it only pushes a key when the counter is > 0. If the file doesn't end with a newline, the stream ends without triggering the final key generation for the last row.

Fix: Track whether the last byte processed was a newline character (lastByteWasNewline variable). In the 'end' event handler, if the file doesn't end with a newline and we have at least one data row (count > 0), push an additional key for the final incomplete row.
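The fix can be sketched as a small standalone function that applies the same byte scan to an in-memory buffer instead of a read stream. scanCsvRowKeys is a hypothetical name used only for illustration; it mirrors the logic of the patched getKeys.

```typescript
// Standalone illustration of the patched key scan. scanCsvRowKeys is a
// hypothetical helper, not part of the module; it scans an in-memory
// buffer instead of a stream but uses the same newline-counting logic.
function scanCsvRowKeys(fileKey: string, data: Buffer): string[] {
  const csvKeys: string[] = []
  let count = 0
  let lastByteWasNewline = true
  for (let i = 0; i < data.length; i += 1) {
    lastByteWasNewline = data[i] === 10
    if (data[i] === 10) {
      if (count > 0) { // count === 0 is the CSV header row and is skipped
        csvKeys.push(`${fileKey}#${count}`)
      }
      count += 1
    }
  }
  // No trailing newline: the final data row is still pending, so emit its key.
  if (!lastByteWasNewline && count > 0) {
    csvKeys.push(`${fileKey}#${count}`)
  }
  return csvKeys
}
```

With a header plus two data rows, scanCsvRowKeys('test.csv', ...) returns ['test.csv#1', 'test.csv#2'] whether or not the file ends with a newline, which is exactly the behavior the patch restores.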



Successfully merging this pull request may close these issues.

trouble loading csv

3 participants