Added support for the recent change - change of base url
To comply with governmental directives and to advance the harmonisation of European insolvency data, the online portal insolvenzbekanntmachungen.de was changed in 2021.
NDelventhal committed Jan 16, 2022
1 parent 9e93adb commit 2726fbe
Showing 5 changed files with 22 additions and 21 deletions.
14 changes: 7 additions & 7 deletions InsolvencyAnnouncementsGer/InsolvencyAnnouncementsGer.py
@@ -23,7 +23,7 @@
"Others", "Decisions taken in the residual-debt exemption proceedings", "Distribution records (§ 188 InsO) of custodians/trustees",
"Monitored insolvency plans", "Dismissals due to lack of assets"]

_default_url = ['https://www.insolvenzbekanntmachungen.de/cgi-bin', '/bl_suche.pl?PHPSESSID=971ac0cbc174f6cfc71ed602a6787558&','Suchfunktion=',
_default_url = ['https://alt.insolvenzbekanntmachungen.de/cgi-bin', '/bl_suche.pl?PHPSESSID=971ac0cbc174f6cfc71ed602a6787558&','Suchfunktion=',
'uneingeschr', '&Absenden=Suche+starten&Bundesland=', '--+Alle+Bundesl%E4nder+--', '&Gericht=','--+Alle+Insolvenzgerichte+--',
'&Datum1=', '', '&Datum2=', '', '&Name=', '', '&Sitz=', '', '&Abteilungsnr=', '', '&Registerzeichen=', '--', '&Lfdnr=',
'', '&Jahreszahl=', '--', '&Registerart=','--+keine+Angabe+--', '&select_registergericht=&Registergericht=', '--+keine+Angabe+--', '&Registernummer=',
@@ -87,7 +87,7 @@ def regcourts_scr():

def inscourts_scr():
"""
Scrapes the insolvency courts of each German state from the webpage insolvenzbekanntmachungen.de.
Scrapes the insolvency courts of each German state from the webpage alt.insolvenzbekanntmachungen.de.
Returns:
A Dataframe, data columns are as follows:
@@ -131,14 +131,14 @@ def insol_proc_scr(reg = "",
search_type = "",
ins_court = "",
scrape_html = True):
"""Scrapes insolvency proceedings information on www.insolvenzbekanntmachungen.de.
"""Scrapes insolvency proceedings information on alt.insolvenzbekanntmachungen.de.
Default arguments contain no search specification, with exception of the search type
(default: uneingeschr or unlimited). The unlimited search is limited to data released within the last two weeks,
while for the detailed search sufficient information needs to be entered.
For the detailed search an insolvency court needs to be specified, among family / firm name,
domicile of debtor or the bankruptcy court file number or in case of a registered firm the registration number
and the registration court.
See https://www.insolvenzbekanntmachungen.de/hilfe.html for more information.
See https://alt.insolvenzbekanntmachungen.de/hilfe.html for more information.
Args:
reg (str or list of str): The register type of the proceeding announcement {"GnR", "HRA", "HRB", "PR", "VR"}
@@ -251,11 +251,11 @@ def insol_proc_scr(reg = "",
return df

def update_url(url = ""):
"""Updates the PHP SESSION ID of the as URL entered link of proceedings from https://www.insolvenzbekanntmachungen.de/,
"""Updates the PHP SESSION ID of the as URL entered link of proceedings from alt.insolvenzbekanntmachungen.de,
which turn invalid after some time.
Args:
url (str): The URL of a proceeding announcementfrom https://www.insolvenzbekanntmachungen.de/ containing a PHP SESSION ID
url (str): The URL of a proceeding announcement from alt.insolvenzbekanntmachungen.de containing a PHP SESSION ID
Returns:
url (str): The URL of the annoucement with a newly generated PHP SESSION ID
@@ -358,7 +358,7 @@ def decode(text = ""):
def insol_proc_scrprep():
'''
Prepares arguments prior to the insolvency proceedings scraping. Requires user input, confirm entries with keyboard command Enter.
Scrapes insolvency proceedings information on www.insolvenzbekanntmachungen.de. See the documentation for insol_proc_scr() for more information.
Scrapes insolvency proceedings information on alt.insolvenzbekanntmachungen.de. See the documentation for insol_proc_scr() for more information.
Args:
user input, confirm entries with keyboard command Enter
@@ -63,7 +63,7 @@ def _urlencode(text = ""):
date_range = date_from + " - " + date_to
output = [stat, date_range, subject]
for r in _reg:
url = ("https://www.insolvenzbekanntmachungen.de/cgi-bin/bl_suche.pl?PHPSESSID=0bf78007299d3c5cd66ae29a5fbed458&Suchfunktion=uneingeschr&Absenden=Suche+starten&Bundesland=" +
url = ("https://alt.insolvenzbekanntmachungen.de/cgi-bin/bl_suche.pl?PHPSESSID=0bf78007299d3c5cd66ae29a5fbed458&Suchfunktion=uneingeschr&Absenden=Suche+starten&Bundesland=" +
state + "&Gericht=--+Alle+Insolvenzgerichte+--&Datum1=" + date_from + "&Datum2="+ date_to +"&Name=&Sitz=&Abteilungsnr=&Registerzeichen=--&Lfdnr=&Jahreszahl=--&Registerart="+ r +
"&select_registergericht=&Registergericht=--+keine+Angabe+--&Registernummer=&Gegenstand=" + subjects + "&matchesperpage=10&sortedby=Datum&page=2#Ergebnis")
html_page = requests.get(url).content
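The search URL above is assembled by string concatenation. As a hedged alternative, the same kind of query string can be produced with `urllib.parse.urlencode`. The helper name `build_search_url`, its arguments, and the reduced parameter set are assumptions for illustration and are not part of the library; Latin-1 encoding is assumed because the portal URLs contain escapes such as `%E4` for "ä". Unlike the original code, this sketch expects raw (not pre-encoded) values.

```python
from urllib.parse import urlencode

# Base address taken from the changed line above.
BASE_URL = "https://alt.insolvenzbekanntmachungen.de/cgi-bin/bl_suche.pl"


def build_search_url(state, date_from, date_to, register_type, subject, session_id):
    """Hypothetical helper: build an unrestricted-search URL from raw values.

    Only a subset of the portal's query parameters is shown here.
    """
    params = [
        ("PHPSESSID", session_id),
        ("Suchfunktion", "uneingeschr"),
        ("Absenden", "Suche starten"),
        ("Bundesland", state),
        ("Gericht", "-- Alle Insolvenzgerichte --"),
        ("Datum1", date_from),
        ("Datum2", date_to),
        ("Registerart", register_type),
        ("Gegenstand", subject),
    ]
    # Assumption: the portal expects Latin-1 percent-encoding (ä -> %E4).
    return BASE_URL + "?" + urlencode(params, encoding="latin-1")
```

Letting `urlencode` handle spaces and umlauts avoids hand-written fragments such as `--+Alle+Bundesl%E4nder+--`.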
2 changes: 1 addition & 1 deletion InsolvencyAnnouncementsGer/__init__.py
@@ -1,5 +1,5 @@
from .InsolvencyAnnouncementsGer import *
from .InsolvencyAnnouncementsGer_summaries import insol_ann_state_summary

__version__ = '0.1.1'
__version__ = '0.2.1'
__author__ = 'Niall Delventhal'
17 changes: 10 additions & 7 deletions README.md
@@ -18,9 +18,12 @@

# InsolvencyAnnouncementsGer

InsolvencyAnnouncementsGer is a Python library for searching, viewing and scraping public announcements of German bankruptcy courts from https://www.insolvenzbekanntmachungen.de.
InsolvencyAnnouncementsGer is a Python library for searching, viewing and scraping public announcements of German bankruptcy courts from https://www.insolvenzbekanntmachungen.de.

To comply with governmental directives and to advance the harmonisation of European insolvency data, the online portal was changed in 2021. As a result, the insolvency courts' announcements on insolvency proceedings are no longer provided through a single source. Publications on insolvency proceedings initiated in 2017 or earlier are offered through "alt.insolvenzbekanntmachungen.de", and publications on insolvency proceedings initiated in 2018 or later are now offered through "neu.insolvenzbekanntmachungen.de". As of now, the library continues to support "alt.insolvenzbekanntmachungen.de".
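
The split by year of initiation can be sketched as a small lookup. The function `portal_host` is hypothetical and not part of the library, which currently only queries the "alt" host; the cut-off years are taken from the paragraph above.

```python
def portal_host(initiation_year: int) -> str:
    """Hypothetical helper: pick the portal host for a proceeding's start year."""
    # Proceedings initiated in 2017 or earlier are published on the "alt" portal.
    if initiation_year <= 2017:
        return "alt.insolvenzbekanntmachungen.de"
    # Proceedings initiated in 2018 or later are published on the "neu" portal,
    # which the library does not support as of this version.
    return "neu.insolvenzbekanntmachungen.de"
```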

*Please note:* Downtime of the by the library accessed German justice portal may occur and also changes of the official register may affect the functionality of the library. The recent change of the online portal of German bankruptcy courts described above is outlined to be the first of multiple phases.

*Please note:* Downtime of the by the library accessed German justice portal may occur and also changes of the official register may affect the functionality of the library.

## Background

@@ -30,7 +33,7 @@ In this context the library's output aims to contribute to transparent research

## Intended Audience: Science/Research

The library target audience is primarily researchers. The library also intends to diminish the barriers non-German speaking researchers may face working with the German justice portal of interest.
The library's target audience is primarily researchers. The library also intends to diminish the barriers non-German speaking researchers may face working with the German justice portal of interest.

## Installation

@@ -101,9 +104,9 @@ ia.update_url(url)
Updates a single scraped url of an announcement, in case it turned invalid.
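
A rough sketch of what swapping the session ID in such a link could look like, using only the standard library. The helper name `replace_session_id` and its `new_session_id` argument are hypothetical; this is not the library's `update_url` implementation, and how a fresh ID is obtained from the portal is left out. The Latin-1 encoding is an assumption based on the `%E4` escapes in the portal URLs.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def replace_session_id(url: str, new_session_id: str) -> str:
    """Hypothetical helper: return `url` with its PHPSESSID value replaced."""
    parts = urlsplit(url)
    # Keep empty parameters such as "Name=" so the query stays complete.
    pairs = parse_qsl(parts.query, keep_blank_values=True, encoding="latin-1")
    pairs = [
        (key, new_session_id if key == "PHPSESSID" else value)
        for key, value in pairs
    ]
    # Re-encode with the same (assumed) Latin-1 encoding to round-trip umlauts.
    new_query = urlencode(pairs, encoding="latin-1")
    return urlunsplit(parts._replace(query=new_query))
```

Parsing the query instead of using a plain string replacement keeps the remaining parameters and the `#Ergebnis` fragment untouched.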

```python
ia.insol_ann_state_summary(subject= "Openings", date_from = "24.10.2020", date_to = "28.10.2020"):
ia.insol_ann_state_summary(subject= "Protective measures", date_from = "24.10.2020", date_to = "28.10.2020"):
```
Returns a summary overview of counts of the announcements associated with the specified subject (example: "Openings") by German state and register type as well as non-register linked annoucements of the specified date range.
Returns a summary overview of counts of the announcements associated with the specified subject (example: "Protective measures") by German state and register type as well as non-register linked annoucements of the specified date range.

*For more details please refer to the functions' docstrings.*

@@ -115,15 +118,15 @@ More information on the insolvency announcement data is available under the foll

## Data protection and online privacy

The library scrapes data from the official register of the German justice portal. According to the German justice portal the following information on an access of the contents from https://www.insolvenzbekanntmachungen.de is stored for six weeks, before the data is made anonymous and is further solely used for statistical purposes:
The library scrapes data from the official register of the German justice portal. According to the German justice portal the following information on an access of the contents from insolvenzbekanntmachungen.de is stored for six weeks, before the data is made anonymous and is further solely used for statistical purposes:

- the name of the file requested
- the date and time of the request
- the quantity of data transmitted
- the error status
- the IP address of the accessing computer

Please refer to https://www.insolvenzbekanntmachungen.de/en/hinweise.html for further information.
Please refer to https://alt.insolvenzbekanntmachungen.de/datenschutz.html for further information.

## Roadmap

8 changes: 3 additions & 5 deletions setup.py
@@ -3,20 +3,18 @@
setuptools.setup(
name="InsolvencyAnnouncementsGer",
packages = ["InsolvencyAnnouncementsGer"],
version="0.1.1",
version="0.2.1",
license='MIT',
url="https://github.com/NDelventhal/InsolvencyAnnouncementsGer",
author="Niall Delventhal",
author_email="[email protected]",
description="InsolvencyAnnouncementsGer is a Python library for searching, viewing and scraping public announcements of German bankruptcy courts from https://www.insolvenzbekanntmachungen.de",
long_description=open('README.md').read(),
long_description_content_type='text/markdown',
download_url = 'https://github.com/NDelventhal/InsolvencyAnnouncementsGer/archive/v_011.tar.gz',
download_url = 'https://github.com/NDelventhal/InsolvencyAnnouncementsGer/archive/v_021.tar.gz',
install_requires=["pandas", "requests", "beautifulsoup4"],
classifiers=['Intended Audience :: Science/Research',
'License :: OSI Approved :: MIT License',
'Development Status :: 3 - Alpha',
# 'Programming Language :: Python',
'Programming Language :: Python :: 3',
'Programming Language :: Python :: 3.6',],
'Programming Language :: Python :: 3.7',],
)
