Lucene.NET Exception on Azure - segments.* file not found in AzureDirectory
Issue context
Environment
.NET 6.0
Lucene.Net 4.8.0-beta0016
Lucene.Net.QueryParser 4.8.0-beta16
Lucene.Net.Store.Azure 4.8.0-beta16 (I upgraded to beta16 manually to be consistent)
Exception
When querying the index stored in Azure Blob Storage after modification, the following exception is thrown:
Error occurred while searching keywords: Lucene.Net.Index.IndexNotFoundException: no segments* file found in AzureDirectory
More details about the issue
When the index is created for the first time, the following files are generated:
Querying the index at this point works without issues.
A second run modifies the index: it deletes all documents, re-adds them, and adds one new document. The index directory in blob storage then looks like the following screenshot:
When querying the index, the exception is thrown:
Error occurred while searching keywords: Lucene.Net.Index.IndexNotFoundException: no segments* file found in AzureDirectory@39b5dcc lockFactory=NativeFSLockFactory@: files: []
   at Lucene.Net.Index.SegmentInfos.FindSegmentsFile.Run(IndexCommit commit)
   at Lucene.Net.Index.StandardDirectoryReader.Open(Directory directory, IndexCommit commit, Int32 termInfosIndexDivisor)
   at Lucene.Net.Index.DirectoryReader.Open(Directory directory)
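To narrow down an error like this, it can help to check the directory listing for a commit point before opening a reader. The following is a minimal sketch, not part of the original code: the class and method names are hypothetical, and it assumes the usual Lucene 4.x naming, where commit files are called segments_N and segments.gen is only a pointer to the latest generation.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper for diagnosis; not part of Lucene.NET or AzureDirectory.
public static class SegmentsFileCheck
{
    // Returns the commit-point files (segments_N) present in a directory listing.
    // segments.gen is deliberately excluded: it does not start with "segments_".
    public static IReadOnlyList<string> FindCommitFiles(IEnumerable<string> fileNames)
    {
        return fileNames
            .Where(name => name.StartsWith("segments_", StringComparison.Ordinal))
            .OrderBy(name => name, StringComparer.Ordinal)
            .ToList();
    }

    // True when the listing contains at least one commit-point file.
    public static bool HasCommitPoint(IEnumerable<string> fileNames)
    {
        return FindCommitFiles(fileNames).Count > 0;
    }
}
```

With Lucene.NET the listing would come from directory.ListAll(), for example SegmentsFileCheck.HasCommitPoint(directory.ListAll()). DirectoryReader.IndexExists(directory) performs a comparable check inside the library. Note that the trace reports "files: []", so the listing itself was empty when the reader was opened.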
Root issue
The segments.gen file was deleted when the index was modified. Re-uploading it then failed because the file already existed in Azure. To fix this problem, we just need to set the overwrite parameter to true.
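The effect of the overwrite flag can be illustrated without touching Azure at all. The sketch below is a hypothetical in-memory stand-in, not the Azure SDK: FakeBlobStore and its methods are invented for illustration, and the exception type is an assumption (the real client reports the conflict through its own exception type). It only mirrors the behavior described above, where an upload to an existing name is rejected unless overwrite is true.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical in-memory stand-in for a blob container; NOT the Azure SDK.
public sealed class FakeBlobStore
{
    private readonly Dictionary<string, byte[]> _blobs = new Dictionary<string, byte[]>();

    // Mirrors the overwrite flag: an existing blob is only replaced
    // when the caller passes overwrite: true.
    public void Upload(string name, byte[] content, bool overwrite = false)
    {
        if (!overwrite && _blobs.ContainsKey(name))
        {
            throw new InvalidOperationException($"Blob '{name}' already exists.");
        }
        _blobs[name] = content;
    }

    // Returns the stored content; throws KeyNotFoundException if the blob is missing.
    public byte[] Download(string name)
    {
        return _blobs[name];
    }
}
```

In this model the first run uploads segments.gen successfully, a second upload of the same name with the default flag throws, and the same call with overwrite: true replaces the content. That matches the sequence in this issue: the initial index creation works, and the failure only appears once the index is modified.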
When I looked into the AzureDirectory library I am using, I found that it does not implement all of its functions properly:
protected override void Dispose(bool disposing)
{
    _fileMutex.WaitOne();
    try
    {
        // make sure it's all written out
        _indexOutput.Flush();
        long originalLength = _indexOutput.Length;
        _indexOutput.Dispose();

        using (var blobStream = new StreamInput(CacheDirectory.OpenInput(_name, IOContext.DEFAULT)))
        {
            // push the blobStream up to the cloud
            _blob.Upload(blobStream, overwrite: true);

            // set the metadata with the original index file properties
            //_blob.SetMetadata();
            Debug.WriteLine($"{_azureDirectory.Name} PUT {_name} bytes to {blobStream.Length} in cloud");
        }

#if FULLDEBUG
        Debug.WriteLine($"{_azureDirectory.Name} CLOSED WRITESTREAM {_name}");
#endif

        // clean up
        _indexOutput = null;
        _blobContainer = null;
        _blob = null;
        GC.SuppressFinalize(this);
    }
    finally
    {
        _fileMutex.ReleaseMutex();
    }
}