Hard Error/Force Quit on files with parentheses - ( ) #833

Open
codejp3 opened this issue Apr 16, 2024 · 2 comments

codejp3 commented Apr 16, 2024

I ran into an issue with the initial full-text search (FTS) index of my files.

I had a directory of files whose names all ended with "(YEAR)", that is, an opening parenthesis, a 4-digit year, and a closing parenthesis (e.g., "Training Manual (2001).pdf"). Whenever the initial index reached one of those files, it appeared to freeze. Even after I let it run for more than a day on a single file, it never progressed and eventually stopped responding. The index process would then force quit itself.

After I renamed the files to remove the parentheses (e.g., "Training Manual 2001.pdf"), they indexed fine.
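
For anyone who wants to apply the same workaround to a whole directory, here is a minimal sketch in Python. It is not part of Nextcloud or the fulltextsearch app; `strip_parens` and `rename_tree` are hypothetical helper names, and the script assumes it is acceptable to simply drop `(` and `)` from file names. It defaults to a dry run and never overwrites an existing file. If files are renamed directly on disk rather than through Nextcloud, a rescan (for example with `occ files:scan`) is probably needed before re-indexing.

```python
import re
from pathlib import Path


def strip_parens(name: str) -> str:
    """Return name with '(' and ')' removed and repeated spaces collapsed."""
    cleaned = re.sub(r"[()]", "", name)
    return re.sub(r" {2,}", " ", cleaned).strip()


def rename_tree(root: Path, dry_run: bool = True) -> list[tuple[Path, Path]]:
    """Plan (and optionally perform) renames for files containing parentheses.

    Returns the list of (old_path, new_path) pairs. With dry_run=True
    nothing on disk is changed.
    """
    planned = []
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        new_name = strip_parens(path.name)
        if new_name == path.name:
            continue  # no parentheses in this file name
        target = path.with_name(new_name)
        if target.exists():
            continue  # never overwrite an existing file
        planned.append((path, target))
        if not dry_run:
            path.rename(target)
    return planned
```

Calling `rename_tree(Path("/path/to/files"))` first and reviewing the returned pairs before rerunning with `dry_run=False` keeps the change easy to audit. The path shown is a placeholder.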

Before renaming, no error was saved to the index log, and nothing indicated that the parentheses were the cause. I guessed that they were, and the guess turned out to be right.

Before renaming:

┌─ Indexing  ────
│ Action: indexDocument
│ Provider: Files                Account: USERNAME
│ Document: 3067515
│ Info: application/pdf
│ Title: Path/To/Training Manual (2001).pdf
│ Content size: 7852544
│ Chunk:   1101/1277
│ Progress:    964/6511
└──
┌─ Results ────
│ Result:    539/539
│ Index: files:3067542
│ Status: ok
│ Message: {"_index":"nc_indexnextcloud","_id":"files:3067542","_version":1,"result":"created","_shards":{"total":2,"successful":1,"failed":0},"_seq_no"
│ :41204,"_primary_term":2}
│ 
└──
┌─ Errors ────
│ Error:      0/0
│ Index: 
│ Exception: 
│ Message: 
│ 
│ 
└──

## x:first result ## c/v:prec/next result ## b:last result
## f:first error ## h/j:prec/next error ## d:delete error ## l:last error
## q:quit ## p:pause 
Force Quit

Note the "Force Quit" at the end of the output above.

occ fulltextsearch:document:status -u USERNAME files 3067515

In DocumentStatus.php line 199:
                                                               
  Specify a valid status: IGNORE, INDEX, DONE, REMOVE, FAILED  

occ fulltextsearch:document:provider -- USERNAME files 3067515

(did not respond and produced no output)

After renaming (and indexing successfully):

occ fulltextsearch:document:status -u USERNAME -j -- files 3067515

{
    "ownerId": "USERNAME",
    "providerId": "files",
    "collection": "local",
    "source": "files_local",
    "documentId": "3067515",
    "lastIndex": 1713212033,
    "errors": [],
    "errorCount": 0,
    "status": 1,
    "options": {
        "_files_pdf": "1",
        "_files_local": "1"
    }
}

{
    "id": "3067515",
    "providerId": "files",
    "access": {
        "ownerId": "USERNAME",
        "viewerId": "",
        "users": [],
        "groups": [],
        "circles": [],
        "links": []
    },
    "modifiedTime": 1485997542,
    "title": "Path/To/Training Manual 2001.pdf",
    "link": "http:\/\/localhost\/index.php\/f\/3067515",
    "index": {
        "ownerId": "USERNAME",
        "providerId": "files",
        "collection": "",
        "source": "files_local",
        "documentId": "3067515",
        "lastIndex": 0,
        "errors": [],
        "errorCount": 0,
        "status": 28,
        "options": {
            "_files_pdf": "1",
            "_files_local": "1"
        }
    },
    "source": "files_local",
    "info": {
        "share_names": {
            "USERNAME": "Path\/To\/Training Manual 2001.pdf"
        }
    },
    "hash": "",
    "contentSize": 7852544,
    "tags": [],
    "metatags": [
        "files_local"
    ],
    "subtags": [],
    "more": {
        "creationTime": 1713211957,
        "accessedTime": 1713210132
    },
    "excerpts": [],
    "score": ""
}
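
The time fields in the output above are easier to compare once converted to dates. A small sketch, assuming these fields are Unix epoch seconds (the output does not state this, but the values are consistent with it); `to_utc` is a hypothetical helper name:

```python
from datetime import datetime, timezone

# Values copied from the JSON output above.
stamps = {
    "lastIndex": 1713212033,
    "creationTime": 1713211957,
    "accessedTime": 1713210132,
    "modifiedTime": 1485997542,
}


def to_utc(ts: int) -> str:
    """Format an epoch-seconds timestamp as an ISO 8601 string in UTC."""
    return datetime.fromtimestamp(ts, tz=timezone.utc).isoformat()


readable = {name: to_utc(ts) for name, ts in stamps.items()}

# Seconds between the recorded creation time and the last index run.
gap_seconds = stamps["lastIndex"] - stamps["creationTime"]

for name, value in readable.items():
    print(f"{name}: {value}")
print(f"lastIndex - creationTime: {gap_seconds} s")
```

Under that assumption, `lastIndex` falls on 2024-04-15 (UTC), 76 seconds after `creationTime`, which fits the report that the file indexed promptly once it had been renamed.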

codejp3 commented Apr 27, 2024

As a follow-up: this behavior is inconsistent. Most files with parentheses in the name do get indexed properly, but a few have not. Removing the parentheses from the file name has resolved the index issue for those files. I wish I had more information to offer on this.

4y55 commented Jun 24, 2024

I am getting an error on a file without parentheses in its name.
I have no idea what is happening. The PDF is not locked, broken, or anything like that.

docker exec --user www-data nextcloud-aio-nextcloud php occ fulltextsearch:index

Options: []
Memory: 45 MB
┌─ Indexing  ────
│ Action: indexDocument
│ Provider: Files                Account: admin
│ Document: 118807
│ Info: application/pdf
│ Title: 01-Livros/Direito Processual Penal/Direito Processual Penal - Renato Brasileiro 2020.pdf
│ Content size: 19235132
│ Chunk:     18/127
│ Progress:      7/8
└──
┌─ Results ────
│ Result:      1/1
│ Index: files:2
│ Status: ok
│ Message: {"_index":"nextcloud-aio","_id":"files:2","_version":1,"result":"no
│ op","_shards":{"total":0,"successful":0,"failed":0},"_seq_no":6,"_primary_ter
│ m":1}
└──
┌─ Errors ────
│ Error:     28/28
│ Index: files:21882
│ Exception: OCA\FullTextSearch_Elasticsearch\Vendor\Elastic\Elasticsearch\Exception\ClientResponseException
│ Message: unknown error
│ 
│ 
└──
## x:first result ## c/v:prec/next result ## b:last result
## f:first error ## h/j:prec/next error ## d:delete error ## l:last error
## q:quit ## p:pause 

In ElasticSearchPlatform.php line 202:
                                                              
  [OCA\FullTextSearch\Exceptions\PlatformTemporaryException]  
                                                              
fulltextsearch:index [--output [OUTPUT]] [-r|--no-readline] [--] [<options>]
