>
5 bed detached house
Charwelton, Daventry NN11
£600,000
Charwelton, Daventry NN11
£600,000
Price History
Initial price | £625,000 |
12/06/24 | £600,000 |
Price Change | -4.00% |
Description
```
I've tried to parse this string to extract the property description and details but I'm having trouble with the list structure and the markdown formatting. Here's what I've tried so far:
```python
import re
text = """[INST]<>
Summarize this property description in a single paragraph without a list
<>
Property Description
A versatile four/five bedroom detached home offering countryside views to the rear. This Ideal family home offers a variety of reception rooms with accommodation arranged over two floors. Situated in a unique development situated in a charming village setting
Property Details
Video Viewings:
If proceeding without a physical viewing please note that you must make all necessary additional investigations to satisfy yourself that all requirements you have of the property will be met. Video content and other marketing materials shown are believed to fairly represent the property at the time they were created.
Property reference 5380226[/INST]"""
# Attempt 1
match = re.search(r'Property Description.*?(\n\s*?)(?=\n*?Property Details)', text, re.DOTALL)
if match:
    description = match.group(1).strip()
    print(description)
# Attempt 2
match = re.search(r'Property Description(.*?)\n(Property Details|Video Viewings)', text)
if match:
    description = match.group(1).strip()
    print(description)
# Both attempts output nothing
```
The issue seems to be with the markdown formatting and the list structure. How can I extract the property description and details from this string?
## Answer (1)
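The problem is not the markdown or the list structure; it is in the two patterns themselves. In Attempt 1 the capturing group wraps only `(\n\s*?)`, so `match.group(1)` holds just the whitespace before `Property Details`, and `.strip()` reduces it to an empty string. In Attempt 2 the group is in the right place, but without `re.DOTALL` the `.` cannot match a newline, so `(.*?)` can never span the lines between the two headings and the search returns `None`. To extract the description, put the lazy group `(.*?)` around the text you want and pass `re.DOTALL` so that `.` also matches newlines (`re.S` is just a shorter alias for the same flag). Additionally, you can use `re.VERBOSE` to make the pattern more readable.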
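As a rough illustration of how the flag and the position of the capturing group change the result, here is a minimal sketch on a shortened, made-up version of the text. The `sample` string and the variable names are assumptions for demonstration only and are not taken from the question.

```python
import re

# Hypothetical miniature of the listing text, for illustration only.
sample = "Property Description\nA versatile home.\nProperty Details\nVideo Viewings:"

# re.S is simply another name for re.DOTALL.
assert re.S is re.DOTALL

heading_pattern = r"Property Description(.*?)\nProperty Details"

# Without DOTALL, "." stops at the newline, so the search finds nothing.
no_flag = re.search(heading_pattern, sample)

# With DOTALL, the lazy group spans the lines up to the next heading.
with_flag = re.search(heading_pattern, sample, re.DOTALL)

# Capturing only the trailing whitespace (as in Attempt 1) leaves an
# empty string once strip() is applied.
whitespace_only = re.search(
    r"Property Description.*?(\n\s*?)(?=\n*?Property Details)", sample, re.DOTALL
)

print(no_flag)                                  # None
print(with_flag.group(1).strip())               # A versatile home.
print(repr(whitespace_only.group(1).strip()))   # ''
```

The same pattern is used for `no_flag` and `with_flag` on purpose, so the only difference between the two calls is the `re.DOTALL` flag.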
Here's a pattern that should work:
```python
import