-
Is there a way to identify a checkbox and read its value? Thx
-
Atypical "Discussions" post -> converting ...
-
If these things are not widgets, there are two more possibilities: they are either text characters or vector graphics (drawings).
In your case, they are not characters, so I checked for drawings as follows:
# make a rectangle starting from "Gender:" and ending at right page border:
r = page.search_for("gender:")[0]
r.x1 = page.rect.width  # extend to right border
r.y0 -= 5  # move the top edge up a bit
r.y1 += 5  # move the bottom edge down a bit
print(page.get_text(clip=r))  # extract text within to confirm we are ok
Gender:
Female
Male
# we can also determine the rectangles of the words:
from pprint import pprint
pprint(page.get_text("words", clip=r))
[(204.47999572753906,
123.88499450683594,
237.09597778320312,
136.25099182128906,
'Gender:',
0,
0,
0),
(256.0799865722656,
123.88499450683594,
286.1669921875,
136.25099182128906,
'Female',
0,
1,
0),
(307.55999755859375,
123.88499450683594,
326.9639892578125,
136.25099182128906,
'Male',
0,
2,
0)]
Now extract the drawings within that rectangle and look at them:
paths = [p for p in page.get_drawings() if p["rect"] in r] # vector graphics within my rectangle
len(paths) # we have 4 different ones!
4
from pprint import pprint
pprint(paths[0]) # first one is a rectangle
{'closePath': False,
'color': (0.0, 0.0, 0.0),
'dashes': '[] 0',
'even_odd': None,
'fill': None,
'fill_opacity': None,
'items': [('re',
Rect(243.1199951171875, 125.280029296875, 252.36000061035156, 134.52003479003906),
-1)],
'layer': '',
'lineCap': (2, 2, 2),
'lineJoin': 0.0,
'rect': Rect(243.1199951171875, 125.280029296875, 252.36000061035156, 134.52003479003906),
'seqno': 41,
'stroke_opacity': 1.0,
'type': 's',
'width': 0.7200000286102295}
pprint(paths[1]) # second one is another rectangle
{'closePath': False,
'color': (0.0, 0.0, 0.0),
'dashes': '[] 0',
'even_odd': None,
'fill': None,
'fill_opacity': None,
'items': [('re',
Rect(294.6000061035156, 125.280029296875, 303.8399963378906, 134.52003479003906),
-1)],
'layer': '',
'lineCap': (2, 2, 2),
'lineJoin': 0.0,
'rect': Rect(294.6000061035156, 125.280029296875, 303.8399963378906, 134.52003479003906),
'seqno': 44,
'stroke_opacity': 1.0,
'type': 's',
'width': 0.7200000286102295}
pprint(paths[2]) # 3rd vector graphic is a line
{'closePath': False,
'color': (0.0, 0.0, 0.0),
'dashes': '[] 0',
'even_odd': None,
'fill': None,
'fill_opacity': None,
'items': [('l',
Point(294.4800109863281, 125.15997314453125),
Point(303.96002197265625, 134.6399688720703))],
'layer': '',
'lineCap': (2, 2, 2),
'lineJoin': 0.0,
'rect': Rect(294.4800109863281, 125.15997314453125, 303.96002197265625, 134.6399688720703),
'seqno': 45,
'stroke_opacity': 1.0,
'type': 's',
'width': 0.47999998927116394}
pprint(paths[3]) # number 4 another line
{'closePath': False,
'color': (0.0, 0.0, 0.0),
'dashes': '[] 0',
'even_odd': None,
'fill': None,
'fill_opacity': None,
'items': [('l', # line
Point(303.9599914550781, 125.15997314453125), # start point
Point(294.47998046875, 134.6399688720703))], # end point
'layer': '',
'lineCap': (2, 2, 2),
'lineJoin': 0.0,
'rect': Rect(294.47998046875, 125.15997314453125, 303.9599914550781, 134.6399688720703),
'seqno': 46,
'stroke_opacity': 1.0,
'type': 's',
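'width': 0.47999998927116394}
The two lines are the diagonals of the second rectangle, so they form a cross in the box just left of "Male": that checkbox is the ticked one, while the box before "Female" is empty.
Maybe it looks complicated ... blame the PDF creator!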
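The manual inspection above can be turned into a small check. This is only a sketch under assumptions: `checked_boxes` and `label_right_of` are hypothetical helpers, a box is treated as ticked when some line drawing has both endpoints inside it (enlarged by a tolerance `tol`), and a box's label is taken to be the nearest word starting to its right. It reads rectangles and points via `tuple(...)`, which is assumed to work for PyMuPDF `Rect` and `Point` objects as well as for plain tuples.

```python
def checked_boxes(paths, tol=1.0):
    """Return (box, is_checked) pairs for the rectangle drawings in paths.

    paths is a list of dicts shaped like the output of page.get_drawings().
    A box counts as checked when at least one line has both endpoints
    inside the box enlarged by tol points on every side.
    """
    boxes, lines = [], []
    for path in paths:
        for item in path["items"]:
            if item[0] == "re":  # rectangle item: ("re", rect, ...)
                boxes.append(tuple(item[1]))
            elif item[0] == "l":  # line item: ("l", start point, end point)
                lines.append((tuple(item[1]), tuple(item[2])))

    def inside(pt, box):
        x0, y0, x1, y1 = box
        return x0 - tol <= pt[0] <= x1 + tol and y0 - tol <= pt[1] <= y1 + tol

    return [
        (box, any(inside(p, box) and inside(q, box) for p, q in lines))
        for box in boxes
    ]


def label_right_of(box, words, tol=1.0):
    """Return the text of the nearest word starting to the right of box.

    words is a list of tuples shaped like page.get_text("words") output:
    (x0, y0, x1, y1, text, ...). Returns None if no word qualifies.
    """
    candidates = [w for w in words if w[0] >= box[2] - tol]
    return min(candidates, key=lambda w: w[0])[4] if candidates else None
```

With `paths` and `r` from the session above, one could loop over `checked_boxes(paths)` and print `label_right_of(box, page.get_text("words", clip=r))` next to each flag. Other PDFs may draw the tick as a curve or a filled shape instead of two lines, in which case the rule for "checked" would need adjusting.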
-
Ok, Thx