
python - Why does the second for loop not start from the beginning? - Stack Overflow


I have pairs of PDF files that I need to merge (each pair). The code works fine, but I don't understand why the second loop always starts with the second file of the pair. I thought it would start from the beginning. I looked at "Python - Why does the second for loop start from the second row", but I am still stuck. Is it the same problem? I don't think so. I think the cause is somewhere here:

fileList1 = Path(folder).glob('*text1.pdf')
fileList2 = Path(folder).glob('*text2.pdf')

start code

#! python3
# merge PDF files based on a /numeric/string/ code in file name
# example1 r01_cz14_028_city_text1.PDF
# example2 r01_cz14_028_city_text2.PDF
import os
from pypdf import PdfWriter
from pathlib import Path

def mergeFiles(folder):
    
    folder = os.path.abspath(folder)   # make sure folder is absolute path

    # make a lists
    fileList1 = Path(folder).glob('*text1.pdf')
    fileList2 = Path(folder).glob('*text2.pdf')

    outputName = ''
    outputFolderPath = folder + '\\somefolder'
    p = Path(outputFolderPath)
    if not p.exists():
            os.makedirs(outputFolderPath)
    n = 0
    folderLenght = len(folder)+1
    fileNameArea = folderLenght+12
    print(f'Adding files in {outputFolderPath}...')


    for filename1 in fileList1:
        n += 1
        
        # match test                   = SECOND LOOP
        for filename2 in fileList2:
            string1 = str(filename1)[folderLenght:fileNameArea].lower()
            string2 = str(filename2)[folderLenght:fileNameArea].lower()

            # Add choosen files in this folder to the PDF file by string.    
            if string1 == string2:
                outputName = 'D' + \
                str(filename1)[folderLenght + 4: folderLenght + 6].upper() + \
                '_' + str(filename1)[folderLenght + 9: folderLenght + 12] + \
                '_text3.pdf'

                outputName = outputFolderPath + '\\' + outputName
                file1Out = str(filename1)
                file2Out = str(filename2)
                pdfMerge([file1Out, file2Out], outputName)  #  function, works fine
                
                break
            
        print (f'{n}. {os.path.basename(outputName)}')
    
    print('Done.')

mergeFiles('X:\\MergeTest')

Asked 22 hours ago by Pacho
  • fileList2 is a generator, so you can only iterate over it once. On the second iteration of the outer loop it will already be completely exhausted and will produce no values. (fileList1 is also a generator, but that doesn't cause a problem because you only iterate over it once.) You need to do something like fileList2 = list(fileList2), which converts the generator into something that can be iterated multiple times. – jasonharper Commented 22 hours ago
  • @jasonharper thanks for insight. – Pacho Commented 2 hours ago
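
To make the point in the comment above concrete, here is a minimal sketch. It assumes a throwaway temporary folder with three empty placeholder files, and the helper name first_item_per_pass is made up for illustration. It restarts the inner loop three times and records which file each pass sees first, once with the raw glob generator and once with a list built from it.

```python
import tempfile
from pathlib import Path

def first_item_per_pass(folder, materialize):
    """Hypothetical helper: record the first file seen on each of three inner-loop passes."""
    inner = Path(folder).glob('*text2.pdf')    # a generator: it remembers its position
    if materialize:
        inner = list(inner)                    # a list: every loop starts from index 0
    firsts = []
    for _ in range(3):                         # stands in for the outer loop
        first = None
        for path in inner:
            first = path.name
            break                              # same early exit as in the question
        firsts.append(first)
    return firsts

with tempfile.TemporaryDirectory() as tmp:
    for name in ('a_text2.pdf', 'b_text2.pdf', 'c_text2.pdf'):
        (Path(tmp) / name).touch()             # empty placeholder files
    # Generator: each pass resumes after the previous break, so the names differ.
    print(first_item_per_pass(tmp, materialize=False))
    # List: each pass restarts, so the same name appears three times.
    print(first_item_per_pass(tmp, materialize=True))
```

The order in which glob yields files depends on the file system, so the exact names printed may vary; what matters is that the generator version never returns to the first file.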

1 Answer

Based on your code, here is what happens. The outer loop iterates over fileList1 (the text1.pdf files), and the inner loop iterates over fileList2 (the text2.pdf files) looking for a match on the extracted string. When a match is found (string1 == string2), the code merges the two files and breaks out of the inner loop. The break exits the inner loop, but fileList2 is an iterator that keeps its position, so on the next iteration of the outer loop the inner loop continues from where it stopped, not from the beginning. The fix is to move fileList2 = Path(folder).glob('*text2.pdf') from before the outer loop to inside it, so that a fresh iterator is created on every pass. Your code should then look like this:

def mergeFiles(folder):

    folder = os.path.abspath(folder)   # make sure folder is an absolute path

    # iterator over the first file of each pair
    fileList1 = Path(folder).glob('*text1.pdf')

    outputName = ''
    outputFolderPath = folder + '\\somefolder'
    p = Path(outputFolderPath)
    if not p.exists():
        os.makedirs(outputFolderPath)
    n = 0
    folderLenght = len(folder)+1
    fileNameArea = folderLenght+12
    print(f'Adding files in {outputFolderPath}...')

    for filename1 in fileList1:
        n += 1
        fileList2 = Path(folder).glob('*text2.pdf')  # moved inside the first loop: fresh iterator each pass
        # match test                   = SECOND LOOP
        for filename2 in fileList2:
            string1 = str(filename1)[folderLenght:fileNameArea].lower()
            string2 = str(filename2)[folderLenght:fileNameArea].lower()

            # Add chosen files in this folder to the PDF file by string.
            if string1 == string2:
                outputName = 'D' + \
                    str(filename1)[folderLenght + 4: folderLenght + 6].upper() + \
                    '_' + str(filename1)[folderLenght + 9: folderLenght + 12] + \
                    '_text3.pdf'

                outputName = outputFolderPath + '\\' + outputName
                file1Out = str(filename1)
                file2Out = str(filename2)
                pdfMerge([file1Out, file2Out], outputName)  # function, works fine

                break

        print(f'{n}. {os.path.basename(outputName)}')

    print('Done.')

mergeFiles('X:\\MergeTest')
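
As a possible alternative to rescanning the folder on every pass, the text2 files could be indexed once in a dictionary and looked up by their code. This is only a sketch under assumptions: pair_files and KEY_LEN are names invented here, the key is the first 12 characters of the file name (mirroring fileNameArea = folderLenght+12 in the question), and the demo uses empty placeholder files instead of real PDFs, so no merging is done.

```python
import tempfile
from pathlib import Path

KEY_LEN = 12  # assumed length of the shared code, e.g. 'r01_cz14_028'

def pair_files(folder):
    """Hypothetical helper: return (text1, text2) path pairs that share the same code."""
    folder = Path(folder)
    # One pass over the text2 files builds a lookup table keyed by the code.
    by_key = {p.name[:KEY_LEN].lower(): p for p in folder.glob('*text2.pdf')}
    pairs = []
    for first in sorted(folder.glob('*text1.pdf')):
        second = by_key.get(first.name[:KEY_LEN].lower())
        if second is not None:                 # skip files that have no partner
            pairs.append((first, second))
    return pairs

with tempfile.TemporaryDirectory() as tmp:
    for name in ('r01_cz14_028_city_text1.pdf', 'r01_cz14_028_city_text2.pdf',
                 'r01_cz14_029_town_text1.pdf', 'r01_cz14_029_town_text2.pdf',
                 'r01_cz14_030_vill_text1.pdf'):   # deliberately left without a partner
        (Path(tmp) / name).touch()
    for first, second in pair_files(tmp):
        print(first.name, '+', second.name)
```

Each returned pair could then be passed to the existing pdfMerge function. The dictionary lookup reads the folder twice in total, whereas creating a new glob inside the outer loop reads it once per text1 file.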
