java - Folder Size is too Large of Lucene Documents - Stack Overflow

I am using the following code to add documents to a Lucene index. I have indexed 23,425 documents, but the folder where the index is stored is 447.4 MB. In contrast, the same 23,425 records stored in a Parquet file take only 625 KB. The Lucene index folder seems excessively large. Could someone help identify why this is happening and how to reduce it? Below is the code I am using:

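    // Imports needed by this fragment
    import java.io.IOException;
    import java.nio.file.Paths;
    import java.util.List;
    import java.util.Map;
    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.document.Document;
    import org.apache.lucene.document.Field.Store;
    import org.apache.lucene.document.StoredField;
    import org.apache.lucene.document.StringField;
    import org.apache.lucene.index.IndexWriter;
    import org.apache.lucene.index.IndexWriterConfig;
    import org.apache.lucene.store.MMapDirectory;

    MMapDirectory indexDirectory = new MMapDirectory(Paths.get(directory));
    // Configure the IndexWriter with an analyzer
    StandardAnalyzer analyzer = new StandardAnalyzer();
    IndexWriterConfig config = new IndexWriterConfig(analyzer);
    IndexWriter indexWriter = new IndexWriter(indexDirectory, config);

    for (Map.Entry<String, OperationAggregation> entry : operations.entrySet()) {
        // One document per operation; these four fields are indexed as exact terms and stored
        Document doc1 = new Document();
        doc1.add(new StringField("namespace", namespace, Store.YES));
        doc1.add(new StringField("type", "operations", Store.YES));
        doc1.add(new StringField("data", entry.getKey(), Store.YES));
        doc1.add(new StringField("serviceName", entry.getValue().getServiceName(), Store.YES));

        // Attributes are stored only (not indexed), each under its own field name
        List<AggregationAttribute> attributes = entry.getValue().getOperationAttributes();
        for (AggregationAttribute attribute : attributes) {
            doc1.add(new StoredField(attribute.getName(), String.valueOf(attribute.getValue())));
        }

        try {
            docCount.getAndIncrement();
            indexWriter.addDocument(doc1);
        } catch (IOException e) {
            logger.error("Error while adding document to index", e);
        }
    }
    // Flush pending documents to disk, then release the writer
    indexWriter.commit();
    indexWriter.close();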
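
Before changing the indexing code, it may help to see which files in the index folder actually account for the 447.4 MB. The sketch below is not part of the original code and uses only `java.nio.file`, with no Lucene calls; the class name `IndexSizeReport` and its two helper methods are hypothetical names chosen for illustration. It lists every regular file directly inside a folder with its size and prints the total. One thing worth checking with the output, offered as a possibility rather than a diagnosis: `IndexWriterConfig` defaults to `OpenMode.CREATE_OR_APPEND`, so running the indexing code repeatedly against the same folder adds documents to the existing index instead of replacing it.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Stream;

// Hypothetical helper for illustration: reports per-file sizes of a folder.
class IndexSizeReport {

    // Maps file name -> size in bytes for each regular file directly inside dir.
    static Map<String, Long> fileSizes(Path dir) throws IOException {
        Map<String, Long> sizes = new TreeMap<>();
        try (Stream<Path> files = Files.list(dir)) {
            files.filter(Files::isRegularFile).forEach(p -> {
                try {
                    sizes.put(p.getFileName().toString(), Files.size(p));
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
        }
        return sizes;
    }

    // Sums all sizes in the map.
    static long totalBytes(Map<String, Long> sizes) {
        return sizes.values().stream().mapToLong(Long::longValue).sum();
    }

    public static void main(String[] args) throws IOException {
        // The folder to inspect is passed as the first argument; defaults to the current folder.
        Path dir = Paths.get(args.length > 0 ? args[0] : ".");
        Map<String, Long> sizes = fileSizes(dir);
        sizes.forEach((name, bytes) -> System.out.printf("%-30s %,d bytes%n", name, bytes));
        System.out.printf("total: %,d bytes%n", totalBytes(sizes));
    }
}
```

Running it with the same path that is passed to `MMapDirectory` (the `directory` variable above) shows whether the space is spread over many segment files or concentrated in a few large ones, which narrows down where to look next.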