├── .gitignore
├── LICENSE
├── README.md
├── automatizacao_edp.py
├── consult_folder.py
├── create_folder.py
├── download_files.py
├── ftp_server.py
├── img
    ├── capa_video_automatizacao_python.cdr
    └── capa_video_automatizacao_python.png
├── language.py
├── move_files.py
├── parameters.py
├── security.py
├── selenium_webdriver
    ├── Leia-me.txt
    ├── chromedriver.exe
    └── chromedriver_win32.zip
└── unzip_files.py


/.gitignore:
--------------------------------------------------------------------------------
  1 | # Byte-compiled / optimized / DLL files
  2 | __pycache__/
  3 | *.py[cod]
  4 | *$py.class
  5 | 
  6 | # C extensions
  7 | *.so
  8 | 
  9 | # Distribution / packaging
 10 | .Python
 11 | build/
 12 | develop-eggs/
 13 | dist/
 14 | downloads/
 15 | eggs/
 16 | .eggs/
 17 | lib/
 18 | lib64/
 19 | parts/
 20 | sdist/
 21 | var/
 22 | wheels/
 23 | pip-wheel-metadata/
 24 | share/python-wheels/
 25 | *.egg-info/
 26 | .installed.cfg
 27 | *.egg
 28 | MANIFEST
 29 | 
 30 | # PyInstaller
 31 | #  Usually these files are written by a python script from a template
 32 | #  before PyInstaller builds the exe, so as to inject date/other infos into it.
 33 | *.manifest
 34 | *.spec
 35 | 
 36 | # Installer logs
 37 | pip-log.txt
 38 | pip-delete-this-directory.txt
 39 | 
 40 | # Unit test / coverage reports
 41 | htmlcov/
 42 | .tox/
 43 | .nox/
 44 | .coverage
 45 | .coverage.*
 46 | .cache
 47 | nosetests.xml
 48 | coverage.xml
 49 | *.cover
 50 | *.py,cover
 51 | .hypothesis/
 52 | .pytest_cache/
 53 | 
 54 | # Translations
 55 | *.mo
 56 | *.pot
 57 | 
 58 | # Django stuff:
 59 | *.log
 60 | local_settings.py
 61 | db.sqlite3
 62 | db.sqlite3-journal
 63 | 
 64 | # Flask stuff:
 65 | instance/
 66 | .webassets-cache
 67 | 
 68 | # Scrapy stuff:
 69 | .scrapy
 70 | 
 71 | # Sphinx documentation
 72 | docs/_build/
 73 | 
 74 | # PyBuilder
 75 | target/
 76 | 
 77 | # Jupyter Notebook
 78 | .ipynb_checkpoints
 79 | 
 80 | # IPython
 81 | profile_default/
 82 | ipython_config.py
 83 | 
 84 | # pyenv
 85 | .python-version
 86 | 
 87 | # pipenv
 88 | #   According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
 89 | #   However, in case of collaboration, if having platform-specific dependencies or dependencies
 90 | #   having no cross-platform support, pipenv may install dependencies that don't work, or not
 91 | #   install all needed dependencies.
 92 | #Pipfile.lock
 93 | 
 94 | # PEP 582; used by e.g. github.com/David-OConnor/pyflow
 95 | __pypackages__/
 96 | 
 97 | # Celery stuff
 98 | celerybeat-schedule
 99 | celerybeat.pid
100 | 
101 | # SageMath parsed files
102 | *.sage.py
103 | 
104 | # Environments
105 | .env
106 | .venv
107 | env/
108 | venv/
109 | ENV/
110 | env.bak/
111 | venv.bak/
112 | 
113 | # Spyder project settings
114 | .spyderproject
115 | .spyproject
116 | 
117 | # Rope project settings
118 | .ropeproject
119 | 
120 | # mkdocs documentation
121 | /site
122 | 
123 | # mypy
124 | .mypy_cache/
125 | .dmypy.json
126 | dmypy.json
127 | 
128 | # Pyre type checker
129 | .pyre/


--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
 1 | MIT License
 2 | 
 3 | Copyright (c) 2022 Marquescharlon Santos
 4 | 
 5 | Permission is hereby granted, free of charge, to any person obtaining a copy
 6 | of this software and associated documentation files (the "Software"), to deal
 7 | in the Software without restriction, including without limitation the rights
 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 | 
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 | 
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
22 | 


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
 1 | # Log in to a website, download and unzip files
 2 | 
 3 | What can you do when a company does not provide an API for you to consume? <br><br>
 4 | After a few days of researching and evaluating ways to automate this process, I came across Selenium, a powerful library. With it you can authenticate and perform actions such as navigating a site and downloading files, which is exactly what this automation sets out to do.
 5 | 
 6 | ### Watch the video for a better overview of this project:
 7 | 
 8 | [![PYTHON AUTOMATION](https://github.com/marquescharlon/WebScraping-Selenium-Python/blob/main/img/capa_video_automatizacao_python.png)](https://youtu.be/DkMk85ZxW-k)
 9 | 
10 | ## Parameters
11 | I will highlight only the main parameters here, the ones that interact directly with Selenium. To set them, open the file ```parameters.py```<br>
12 | 
13 | To set the site address, assign it to the variable ```link=''```.
14 | If the address of the page that lists the files is fixed, you can set it in ```page_files=''```
15 | 
16 | > In my case I did not need to click through different menus to reach that page, but if you do, the script can be adapted to your needs.
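
A minimal sketch of how the two settings could be sanity-checked before the browser is started. Only the names `link` and `page_files` come from `parameters.py`; the helper `check_parameters` and the example URLs are hypothetical and are not part of this repository.

```python
from urllib.parse import urlparse


def check_parameters(link: str, page_files: str) -> list:
    """Return a list of problems found in the two URL settings (empty if none)."""
    problems = []
    for name, value in (("link", link), ("page_files", page_files)):
        parsed = urlparse(value)
        # A usable address needs an http(s) scheme and a host name.
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            problems.append(f"{name} is not a valid http(s) URL: {value!r}")
    return problems


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    for problem in check_parameters("https://example.com/login", ""):
        print(problem)
```

Running a check like this first would report an empty `page_files` before Selenium tries to open it.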
17 | 
18 | ## Configure the Selenium dependencies
19 | 
20 | To let Selenium open and drive the browser, you need to download **chromedriver_win32.zip**. If you do not use Google Chrome, download the driver for your specific browser instead. For more information, see the [Selenium](https://www.selenium.dev/pt-br/documentation/webdriver/getting_started/install_drivers/) documentation.
21 | 
22 | a) Go to https://chromedriver.chromium.org/downloads and download the file that matches your browser version. To find out which version you have, open Google Chrome and go to ```Settings > About Chrome```.
23 | 
24 | b) Unzip the file **chromedriver_win32.zip** <br>
25 | c) Copy its contents into the project folder ```C:\WebScraping-Selenium-Python\selenium_webdriver\```
26 | 
27 | > Another important step is to add that folder to the Path environment variable. It is also simple, just follow the next steps.
28 | 
29 | d) Right-click **This PC / My Computer** and choose Properties <br>
30 | e) Click Advanced system settings <br>
31 | f) Click **Environment Variables** <br>
32 | g) Select **Path** and click Edit <br>
33 | h) Click New <br>
34 | i) Add the path of the folder where you placed **chromedriver.exe** <br>
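
Once the Path entry is in place, a quick check along the following lines can confirm that the driver is reachable. This is a sketch using only the standard library; the helper name `find_driver` is an assumption made for illustration and does not exist in this repository.

```python
import shutil
from typing import Optional


def find_driver(name: str = "chromedriver") -> Optional[str]:
    """Return the full path of the driver if it is reachable through PATH, else None."""
    # shutil.which searches the folders listed in PATH; on Windows it also
    # tries the extensions in PATHEXT, so "chromedriver" matches chromedriver.exe.
    return shutil.which(name)


if __name__ == "__main__":
    path = find_driver()
    if path is None:
        print("chromedriver was not found on PATH; review steps d) to i).")
    else:
        print(f"chromedriver found at {path}")
```

Open a new terminal before running it, since a terminal that was already open may not see the updated Path.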
35 | 
36 | ## Libraries used
37 | 
38 | ```os.path```
39 | ```pathlib```
40 | ```selenium```
41 | ```time```
42 | ```ftplib```
43 | ```sys```
44 | ```os```
45 | ```pyodbc```
46 | ```datetime```
47 | ```bs4```
48 | ```zipfile```
49 | ```pyinstaller```
50 | 
51 | ## Implemented steps
52 | 
53 | - [x] Access the site and log in
54 | - [x] Navigate to the page that holds the files to be downloaded
55 | - [x] Add a condition, via XPath, so that only the correct files are downloaded
56 | - [x] Create the temporary directories Temp and Duplicados for file handling
57 | - [x] Check that the number of files in the download folder matches the number downloaded from the site
58 | - [x] Unzip .ZIP files
59 | - [x] Move files to the destination folder C:\Concessionária\Retorno\
60 | - [x] Log the files to be copied to the server
61 | - [x] Copy files to the server
62 | - [x] Also keep a copy in the folder C:\Concessionária\Retorno\Baixados\
63 | - [x] Delete from C:\Concessionária\Retorno\ the files that were copied successfully
64 | - [x] Support running for more than one user
65 | - [x] Send the log by e-mail
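
The middle of this list (wait for the downloads, unzip, move) can be sketched with the standard library alone. This is not the project's own implementation, which is spread across `consult_folder.py`, `unzip_files.py` and `move_files.py`. The function names, the timeout values and the use of `pathlib` are assumptions made for illustration; the `.crdownload` suffix is the one Chrome gives to unfinished downloads.

```python
import shutil
import time
import zipfile
from pathlib import Path


def wait_for_downloads(folder: Path, timeout: float = 60.0, poll: float = 1.0) -> bool:
    """Return True once no Chrome partial file (*.crdownload) remains in folder."""
    deadline = time.monotonic() + timeout
    while any(folder.glob("*.crdownload")):
        if time.monotonic() >= deadline:
            return False  # gave up: a download is still in progress
        time.sleep(poll)
    return True


def unzip_and_move(temp: Path, destination: Path) -> list:
    """Extract every .ZIP in temp, delete the archive, then move the .TXT files."""
    # sorted() takes a snapshot, so extracting into temp does not disturb the loop.
    for archive in sorted(temp.iterdir()):
        if archive.is_file() and archive.suffix.upper() == ".ZIP":
            with zipfile.ZipFile(archive) as zf:
                zf.extractall(temp)
            archive.unlink()

    destination.mkdir(parents=True, exist_ok=True)
    moved = []
    for txt in sorted(temp.iterdir()):
        if txt.is_file() and txt.suffix.upper() == ".TXT":
            target = destination / txt.name
            if target.exists():
                target.unlink()  # overwrite a file left by a previous run
            shutil.move(str(txt), str(target))
            moved.append(txt.name)
    return moved
```

Unlike a fixed pause, the polling loop returns as soon as the partial files disappear and gives up after `timeout` seconds instead of waiting forever.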
66 | 
67 | ## Generate the executable
68 | 
69 | First, install PyInstaller, the library that generates the .exe file: <br>
70 | ```
71 | pip install PyInstaller
72 | ```
73 | Now go to the root of your project and run the following command:
74 | ```
75 | pyinstaller --onefile --noconsole automatizacao_edp.py
76 | ```
77 | 
78 | If the script imports a library of your own or a third-party one that PyInstaller cannot find, add the option **--paths=../** so that folder is also searched for imports:
79 | 
80 | ```
81 | pyinstaller --onefile --noconsole --paths=../ automatizacao_edp.py
82 | ```
83 | 
84 | > Remember that automatizacao_edp.py is the name of the main file, which is why it is passed to the command that generates the executable.
85 | 
86 | # Conclusion
87 | 
88 | This automation became necessary as the volume of daily tasks grew: every day someone had to access a given site and download the return files, which could not go unread. Doing this by hand is error-prone and, as a result, wastes resources. It also ties directly into customer centricity: by reducing failures and speeding up the processing of customers' monthly payments, we spare them the inconvenience of having their services blocked.
89 | 


--------------------------------------------------------------------------------
/automatizacao_edp.py:
--------------------------------------------------------------------------------
 1 | 
 2 | import download_files
 3 | import ftp_server
 4 | 
 5 | def main() -> None:
 6 | 
 7 |     download_files.download('Bandeirante')
 8 |     download_files.download('ESCELSA')
 9 | 
10 |     ftp_server.copy_70('Bandeirante')
11 |     ftp_server.copy_70('ESCELSA')
12 | 
13 | if __name__ == '__main__':
14 |     main()


--------------------------------------------------------------------------------
/consult_folder.py:
--------------------------------------------------------------------------------
 1 | import os.path
 2 | import parameters as p
 3 | 
 4 | def result_qtd(source_folder, file_begin, file_end):
 5 |     list_files = os.listdir(source_folder)
 6 | 
 7 |     qtd_file = 0
 8 | 
 9 |     for file in list_files:
10 |         if file_end in file:
11 |             if file_begin in file:        
12 |                 qtd_file += 1 
13 | 
14 |     return qtd_file
15 | 
16 | def result_file(source_folder):
17 |     list_files = os.listdir(source_folder)
18 |     return list_files
19 | 
20 | def duplicate_qtd(source_folder,identify_duplicate, identify_duplicate_final):
21 |     list_files = os.listdir(source_folder)
22 |     
23 |     qtd_file = 0
24 | 
25 |     for file in list_files:
26 |         if identify_duplicate in file:
27 |             if identify_duplicate_final in file:        
28 |                 qtd_file += 1 
29 | 
30 |     return qtd_file
31 | 
32 | def await_file(source_folder, file_begin, file_end):
33 |     list_files = os.listdir(source_folder)
34 | 
35 |     qtd_file = 0
36 |     
37 |     for file in list_files:
38 |         if file_begin in file:
39 |             if file_end in file:
40 |                 qtd_file+=1
41 |     return qtd_file
42 | 
43 | 
44 | if __name__ == '__main__':
45 | 
46 |     concessionaria = 'Bandeirante'
47 | 
48 |     print(
49 |       duplicate_qtd(
50 |           p.downloads_folder, 
51 |           p.identify_file(concessionaria, 'begin', duplicate=True), 
52 |           p.identify_file(concessionaria, 'end', duplicate=True)
53 |       )
54 |     )
55 | 
56 |     print(
57 |       result_qtd(
58 |           p.downloads_folder, 
59 |           p.identify_file(concessionaria, 'begin'), 
60 |           p.identify_file(concessionaria, 'end')
61 |       )
62 |     )
63 | 
64 |     print(await_file(
65 |         p.downloads_folder,
66 |         p.identify_file(concessionaria, 'begin'), 
67 |         ".ZIP.crdownload"
68 |     ))


--------------------------------------------------------------------------------
/create_folder.py:
--------------------------------------------------------------------------------
 1 | import os.path
 2 | import language
 3 | from pathlib import Path
 4 | 
 5 | def create_folder(folder_local:str='', folder_name:str=''):
 6 | 
 7 |     """
 8 |     folder_local='Downloads', 'Documentos', 'Imagens' or 'Videos' \n
 9 |     folder_name='TESTE' \n
10 |     folder_name='TESTE/Subpasta'
11 |     """
12 | 
13 |     def folder_name_local(folder_local):
14 |         folder_ = language.translation(folder_local)
15 |         return folder_
16 | 
17 |     folder_path = os.path.expanduser(rf"~\{folder_name_local(folder_local)}")
18 |         
19 |     print(folder_path)
20 | 
21 |     def folder_mount(folder_path, folder_name):
22 |         print(folder_name)
23 |         
24 |         _path = folder_path
25 |         _path = os.path.join(rf'{_path}',folder_name)
26 |         print(_path)
27 |         if Path(_path).is_dir():
28 |             print("Directory '%s' already exists." % _path)
29 |             return
30 |         # exist_ok=True avoids a FileExistsError if the folder appears
31 |         # between the check above and this call
32 |         os.makedirs(_path, exist_ok=True)
33 | 
34 |     folder_mount(folder_path, folder_name)
35 | 
36 | 
37 | if __name__ == '__main__':
38 | 
39 |     create_folder("Documentos", "EDP/Temp/Duplicados")


--------------------------------------------------------------------------------
/download_files.py:
--------------------------------------------------------------------------------
  1 | from selenium import webdriver
  2 | from selenium.webdriver.chrome.options import Options
  3 | from selenium.webdriver.chrome.service import Service
  4 | from time import sleep
  5 | from bs4 import BeautifulSoup
  6 | import security as pw
  7 | import pandas as pd
  8 | import create_folder as cf
  9 | import move_files as mf
 10 | import consult_folder
 11 | import parameters as p
 12 | import unzip_files
 13 | 
 14 | def download(concessionaria:str=''):
 15 |     """
 16 |     Name of the utility company: concessionaria=''
 17 |     """
 18 | 
 19 |     username_field = pw.user(concessionaria, 'User')
 20 |     password_field = pw.user(concessionaria, 'Password')
 21 | 
 22 |     options = Options()
 23 |     # options.add_argument("--headless")
 24 |     options.add_argument('--no-sandbox')
 25 |     options.add_argument('--disable-gpu')  # only needed on Windows
 26 |     # options.add_argument('start-maximized')
 27 |     options.add_argument('disable-infobars')
 28 |     options.add_argument("--disable-extensions")
 29 |     navegador = webdriver.Chrome(service=Service(p.executable_path), options=options)  # Selenium 4 API
 30 |     navegador.minimize_window()
 31 |     navegador.get(url=p.link)
 32 |     print("Chrome initialized on Windows OS")
 33 | 
 34 |     # For security reasons some details, such as the XPath values, were removed
 35 |     # Note that the real XPath is not simply '//*[@id=""]'; it has more information inside the ""
 36 | 
 37 |     input_email = navegador.find_element("xpath", '//*[@id=""]') 
 38 |     input_email.send_keys(username_field)
 39 | 
 40 |     input_password = navegador.find_element("xpath", '//*[@id=""]')
 41 |     input_password.send_keys(password_field)
 42 | 
 43 |     button_enter = navegador.find_element("xpath", '//*[@id=""]')
 44 |     button_enter.click()
 45 | 
 46 |     navegador.get(p.page_files)
 47 | 
 48 |     qtd_site = 0
 49 | 
 50 |     def get_FezDownload(line):
 51 |         xpath = '//*[@id="TableIndex"]/tbody/tr['+line+']'
 52 |         file = navegador.find_element("xpath", xpath)
 53 |         html_returns = file.get_attribute('innerHTML')
 54 |         records = BeautifulSoup(html_returns, 'html.parser')
 55 |         for row in records.find_all("td", attrs={"class": "text-center"}):
 56 |             x = row.text
 57 |             x = ''.join(filter(str.isalnum, x))
 58 |             if x == "Sim":
 59 |                 return 1
 60 |             if x == "Não":
 61 |                 return 0
 62 | 
 63 |     file_df = {'file_name': []}
 64 |     file_df = pd.DataFrame(file_df)
 65 | 
 66 |     for i in range(10):
 67 |         line = str(i+1)
 68 |         xpath = '//*[@id="TableIndex"]/tbody/tr['+line+']/td[1]/a'
 69 |         file = navegador.find_element("xpath", xpath)
 70 | 
 71 |         soup = BeautifulSoup(file.text, 'html.parser')
 72 |         file_df.loc[i] = [str(soup)]
 73 | 
 74 |         if get_FezDownload(line) == 0:
 75 |             file.click()
 76 |             qtd_site += 1
 77 |         
 78 |     cf.create_folder("Documentos", "EDP/Temp")
 79 |     cf.create_folder("Documentos", "EDP/Temp/Duplicados")
 80 | 
 81 |     files_folder_ = consult_folder.result_qtd(
 82 |         p.downloads_folder,
 83 |         p.identify_file(concessionaria, 'begin'),  
 84 |         p.identify_file(concessionaria, 'end')
 85 |     )
 86 | 
 87 |     duplicate_qtd = consult_folder.duplicate_qtd(
 88 |         p.downloads_folder, 
 89 |         p.identify_file(concessionaria, 'begin', duplicate=True), 
 90 |         p.identify_file(concessionaria, 'end', duplicate=True)
 91 |     )
 92 | 
 93 |     qtd_site -= duplicate_qtd
 94 | 
 95 |     await_file_ = consult_folder.await_file(
 96 |             p.downloads_folder,
 97 |             p.identify_file(concessionaria, 'begin'), 
 98 |             ".ZIP.crdownload"
 99 |         )
100 | 
101 |     if files_folder_ >=  qtd_site:
102 |         if await_file_ >= 1:
103 |             sleep(15)
104 |         mf.move_files(
105 |             p.downloads_folder, 
106 |             p.identify_file(concessionaria, 'begin'), 
107 |             p.identify_file(concessionaria, 'end'), 
108 |             p.temp_folder, 
109 |             qtd_site
110 |         )
111 |         mf.move_files_duplicates(
112 |             p.downloads_folder,
113 |             p.identify_file(concessionaria, 'begin', duplicate=True), 
114 |             p.identify_file(concessionaria, 'end', duplicate=True), 
115 |             p.temp_duplicados
116 |         )      
117 |         unzip_files.files(
118 |             p.temp_folder, 
119 |             p.identify_file(concessionaria, 'begin'), 
120 |             p.identify_file(concessionaria, 'end'), 
121 |             delete_file=True
122 |         )
123 |         mf.move_files(
124 |             p.temp_folder, 
125 |             p.identify_file(concessionaria, 'begin', unzip=True), 
126 |             p.identify_file(concessionaria, 'end', unzip=True), 
127 |             p.return_folder(concessionaria)
128 |         )
129 | 
130 |     navegador.quit()
131 | 
132 | if __name__ == '__main__':
133 |     
134 |     download('Bandeirante')
135 |     download('ESCELSA')
136 | 


--------------------------------------------------------------------------------
/ftp_server.py:
--------------------------------------------------------------------------------
 1 | from ftplib import FTP
 2 | import security as pw
 3 | import sys
 4 | import os
 5 | import parameters as p
 6 | sys.path.insert(0, 'C:\\Repositorios\\Projetos_Python\\')
 7 | import function_models.function_models as fm
 8 | 
 9 | ftp_70 = FTP()
10 | 
11 | # Avoid putting passwords here
12 | # If you add them to security.py you can reference them from there, as done here
13 | 
14 | def copy_70(concessionaria):
15 | 
16 |     global str_log
17 |     str_log = ""
18 | 
19 |     folder_136 = f'C:\\{concessionaria}\\Retorno\\'
20 |     path_70 = f'/Cobranca/{concessionaria}/Retorno'
21 |  
22 |     try:
23 |         ftp_70.connect(pw.ftp70_host, pw.ftp70_porta)
24 |         ftp_70.login(pw.ftp70_username, pw.ftp70_password)
25 |         ftp_70.encoding = 'utf-8' 
26 |         str_log += fm.LogMensagem('C_SFTP',servidor=pw.ftp70_host)
27 |         ftp_70.nlst()
28 | 
29 |         ftp_70.cwd(f'/Cobranca/{concessionaria}/Retorno')
30 | 
31 |         list_files = os.listdir(folder_136)
32 | 
33 |         qtd_file = 0
34 | 
35 |         for file in list_files:
36 |             if '.TXT' in file:
37 |                 with open(folder_136 + file, 'rb') as fileObject:
38 |                     ftp_70.storlines('STOR ' + file, fileObject)
39 |                 # Download the uploaded file back into Baixados as the local copy
40 |                 with open(folder_136 + 'Baixados\\' + file, 'wb') as copia:
41 |                     ftp_70.retrbinary('RETR ' + file, copia.write)
42 | 
43 |                 str_log += fm.LogMensagem('SCA',filename=file,servidor=p.folder_70 + path_70)
44 |                 qtd_file+=1 
45 | 
46 |                 if os.path.exists(folder_136 + file):
47 |                     os.remove(folder_136 + file)
48 | 
49 |         if qtd_file == 0:
50 |             str_log += fm.LogMensagem(tipo='P', mensagem='Site checked, but there were no return files to download.')
51 | 
52 |         fm.EnviaEmail(destinatario=pw.email_cobranca, assunto_email='Log Retorno ' + concessionaria, modelo_html='', mensagem=str_log, 
53 |         cabecalho=True, titulo_cabecalho=concessionaria)
54 |      
55 |     except Exception as ex:
56 |         str_log += fm.LogMensagem('E_SFTP',servidor=pw.ftp70_host,erro=ex)
57 | 
58 | if __name__ == '__main__':
59 |     
60 |     copy_70('Bandeirante')
61 |     copy_70('ESCELSA')


--------------------------------------------------------------------------------
/img/capa_video_automatizacao_python.cdr:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/marquescharlon/Automatizacao-Selenium-Python/4ded05350eaf8bb11ac953125dcd3493cbc310a3/img/capa_video_automatizacao_python.cdr


--------------------------------------------------------------------------------
/img/capa_video_automatizacao_python.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/marquescharlon/Automatizacao-Selenium-Python/4ded05350eaf8bb11ac953125dcd3493cbc310a3/img/capa_video_automatizacao_python.png


--------------------------------------------------------------------------------
/language.py:
--------------------------------------------------------------------------------
 1 | 
 2 | def translation(folder):
 3 |     if folder == "Downloads":
 4 |         return "Downloads"
 5 |     elif folder == "Documentos" or folder == "Documents":
 6 |         return "Documents"
 7 |     elif folder == "Imagens" or folder == "Pictures":
 8 |         return "Pictures"
 9 |     elif folder == "Videos":
10 |         return "Videos"
11 |     else:
12 |         return folder


--------------------------------------------------------------------------------
/move_files.py:
--------------------------------------------------------------------------------
  1 | from time import sleep
  2 | import os.path
  3 | from pathlib import Path
  4 | import consult_folder
  5 | import parameters as p
  6 | 
  7 | def move_files(source_folder:str='', file_begin:str='', file_end:str='', destination_folder:str='', qtd_site=0):
  8 | 
  9 |     """
 10 |     Folder that holds the files: source_folder='' \n
 11 |         Downloads folder: p.downloads_folder \n
 12 |         Temp folder: p.temp_folder \n
 13 |         Temp/Duplicados folder: p.temp_duplicados \n
 14 |         Or the return folder: return_folder(concessionaria) \n
 15 |     Match the beginning of the file name: file_begin='' \n
 16 |         p.identify_file(concessionaria, 'begin') \n
 17 |         p.identify_file(concessionaria, 'begin', duplicate=True) \n
 18 |         p.identify_file(concessionaria, 'begin', unzip=True) \n
 19 |     Match the end of the file name: file_end='' \n
 20 |         p.identify_file(concessionaria, 'end') \n
 21 |         p.identify_file(concessionaria, 'end', duplicate=True) \n
 22 |         p.identify_file(concessionaria, 'end', unzip=True) \n
 23 |     Destination folder: destination_folder='' \n
 24 |     Number of files downloaded from the site, checked against the folder: qtd_site=0
 25 |     """
 26 | 
 27 |     qtd_site_ = qtd_site
 28 | 
 29 |     files_folder = consult_folder.result_qtd(source_folder, file_begin, file_end)
 30 |     qtd_files_folder = 0
 31 | 
 32 |     list_files =  consult_folder.result_file(source_folder)
 33 | 
 34 |     if qtd_site_ != 0:
 35 | 
 36 |         if int(qtd_site) != int(files_folder):
 37 |             qtd_files_folder = files_folder
 38 | 
 39 |             # Poll the folder until every expected file has arrived (no timeout)
 40 |             while qtd_site > qtd_files_folder:
 41 |                 sleep(1)  # avoid a busy loop while the downloads finish
 42 |                 list_files = consult_folder.result_file(source_folder)
 43 |                 qtd_files_folder = 0
 44 |                 for file in list_files:
 45 |                     if file_begin in file and file_end in file:
 46 |                         qtd_files_folder += 1
 47 | 
 48 |         if int(qtd_site) == int(qtd_files_folder) or int(qtd_site) == int(files_folder):
 49 |             list_files = os.listdir(source_folder)
 50 |             print(list_files)
 51 |             for file in list_files:
 52 |                 if file_begin in file:
 53 |                     if file_end in file:
 54 |                         try:
 55 |                             if os.path.exists(f"{destination_folder}/{file}"):
 56 |                                 os.remove(f"{destination_folder}/{file}")
 57 |                             os.rename(f"{source_folder}/{file}", f"{destination_folder}/{file}")
 58 |                         except FileNotFoundError as erro:
 59 |                             print(erro)
 60 |                         except TypeError as erro:
 61 |                             print(erro)
 62 |                         except Exception as erro:
 63 |                             print("An error other than the ones already handled occurred while moving file(s):", erro)
 64 |     else:
 65 |         list_files = os.listdir(source_folder)
 66 |         for file in list_files:
 67 |             if file_begin in file:
 68 |                 if file_end in file:
 69 |                     try:
 70 |                         if os.path.exists(f"{destination_folder}/{file}"):
 71 |                             os.remove(f"{destination_folder}/{file}")
 72 |                         os.rename(f"{source_folder}/{file}", f"{destination_folder}/{file}")
 73 |                     except FileNotFoundError as erro:
 74 |                         print(erro)
 75 |                     except TypeError as erro:
 76 |                         print(erro)
 77 |                     except Exception as erro:
 78 |                         print("An error other than the ones already handled occurred while moving file(s):", erro)
 79 | 
 80 | def move_files_duplicates(source_folder:str='',identify_duplicate:str='', identify_duplicate_final:str='', destination_folder:str=''):
 81 | 
 82 |     """
 83 |     Folder that holds the files: source_folder='' \n
 84 |         Downloads folder: p.downloads_folder \n
 85 |         Temp folder: p.temp_folder \n
 86 |         Temp/Duplicados folder: p.temp_duplicados \n
 87 |         Or the return folder: return_folder(concessionaria) \n
 88 |     Match the beginning of the file name: identify_duplicate='' \n
 89 |         p.identify_file(concessionaria, 'begin', duplicate=True) \n
 90 |     Match the end of the file name: identify_duplicate_final='' \n
 91 |         p.identify_file(concessionaria, 'end', duplicate=True) \n
 92 |     Destination folder: destination_folder=''
 93 |     """
 94 | 
 95 |     list_files = os.listdir(source_folder)
 96 | 
 97 |     for file in list_files:
 98 |         if identify_duplicate in file:
 99 |             if identify_duplicate_final in file:        
100 |                 try:
101 |                     if os.path.exists(f"{destination_folder}/{file}"):
102 |                         os.remove(f"{destination_folder}/{file}")
103 |                     os.rename(f"{source_folder}/{file}", f"{destination_folder}/{file}")
104 |                 except FileNotFoundError as erro:
105 |                     print(erro)
106 |                 except FileExistsError as erro:
107 |                     print(erro)
108 | 
109 | if __name__ == '__main__':
110 | 
111 |     concessionaria = 'Bandeirante'
112 | 
113 |     move_files(
114 |         p.downloads_folder, 
115 |         p.identify_file(concessionaria, 'begin'), 
116 |         p.identify_file(concessionaria, 'end'),
117 |         p.temp_folder,
118 |         9
119 |     )
120 | 
121 |     move_files_duplicates(
122 |         p.downloads_folder,
123 |         p.identify_file(concessionaria, 'begin', duplicate=True), 
124 |         p.identify_file(concessionaria, 'end', duplicate=True), 
125 |         p.temp_duplicados
126 |     )
127 | 
128 |     move_files(
129 |         p.temp_folder, 
130 |         p.identify_file(concessionaria, 'begin', unzip=True), 
131 |         p.identify_file(concessionaria, 'end', unzip=True), 
132 |         p.return_folder(concessionaria)
133 |     )


--------------------------------------------------------------------------------
/parameters.py:
--------------------------------------------------------------------------------
 1 | # Installation
 2 | 
 3 | # pip install selenium
 4 | # pip install requests
 5 | # pip install bs4
 6 | # pathlib and time (sleep) ship with Python's standard library; no install needed
 7 | # pip install pandas  (imported by download_files.py)
 8 | 
 9 | import os.path
10 | import security as pw # Keep every password you need in a protected file
11 | 
12 | folder_70 = fr''
13 | folder_siparq = fr''
14 | link = ''
15 | page_files = '' 
16 | executable_path = r'C:\WebScraping-Selenium-Python\selenium_webdriver\chromedriver.exe'
17 |     
18 | 
19 | # Obter o caminho das pastas
20 | downloads_folder = os.path.expanduser(r"~\Downloads")
21 | temp_folder = os.path.expanduser(r"~\Documents/EDP/Temp")
22 | temp_duplicados = os.path.expanduser(r"~\Documents/EDP/Temp/Duplicados")
23 | 
24 | 
25 | def return_folder(concessionaria:'str'=''):
26 |     """
27 |     Name of the utility company: concessionaria='Name'
28 |     """
29 |     if concessionaria == 'Bandeirante':
30 |         path = os.path.expanduser(r"C:\Bandeirante\Retorno")
31 |         return path
32 |     if concessionaria == 'ESCELSA':
33 |         path = os.path.expanduser(r"C:\ESCELSA\Retorno")
34 |         return path
35 | 
36 | def identify_file(concessionaria:'str'='', position:'str'='', unzip=False, duplicate=False):
37 |     """
38 |     Name of the utility company: concessionaria='Name' \n
39 |     Match the beginning of the file name: position='begin' \n
40 |     Match the end of the file name: position='end' \n
41 |     If the file is already unzipped, pass: unzip=True \n
42 |     If you are looking for duplicate files, pass: duplicate=True
43 |     """
44 | 
45 |     # Used to identify the files that belong to each supplier
46 |     if concessionaria == 'Bandeirante':
47 |         if position == 'begin':
48 |             if unzip == True:
49 |                 file_begin_unzip_band = "CARTAO_"
50 |                 return file_begin_unzip_band
51 |             if duplicate == True:
52 |                 identify_duplicate = "CARTAO_TODOS_"
53 |                 return identify_duplicate
54 |             else:
55 |                 file_begin_band = "CARTAO_"
56 |                 return file_begin_band
57 |         elif position == 'end':
58 |             if unzip == True:
59 |                 file_end_unzip_band = "_CONV.TXT"
60 |                 return file_end_unzip_band
61 |             if duplicate == True:
62 |                 identify_duplicate_final = ").ZIP"
63 |                 return identify_duplicate_final
64 |             else:
65 |                 file_end_band = "_CONV.ZIP"
66 |                 return file_end_band
67 |     if concessionaria == 'ESCELSA':
68 |         if position == 'begin':
69 |             if unzip == True:
70 |                 file_begin_escelsa = "498_COBR_"
71 |                 return file_begin_escelsa
72 |             if duplicate == True:
73 |                 identify_duplicate = "498_COBR_"
74 |                 return identify_duplicate
75 |             else:
76 |                 file_begin_escelsa = "498_COBR_"
77 |                 return file_begin_escelsa
78 |         elif position == 'end':
79 |             if unzip == True:
80 |                 file_end_unzip_escelsa = "_CONV.TXT"
81 |                 return file_end_unzip_escelsa
82 |             if duplicate == True:
83 |                 identify_duplicate_final = ").ZIP"
84 |                 return identify_duplicate_final
85 |             else:
86 |                 file_end_escelsa = "_CONV.ZIP"
87 |                 return file_end_escelsa
88 | 
89 | if __name__ == '__main__':
90 | 
91 |     print(identify_file('Bandeirante', 'begin'))
92 |     print(identify_file('Bandeirante', 'end', unzip=True))
93 |     print(identify_file('Bandeirante', 'begin', duplicate=True))
94 | 
95 |     print(return_folder('Bandeirante'))
96 | 


--------------------------------------------------------------------------------
/security.py:
--------------------------------------------------------------------------------
 1 | # FTP
 2 | ftp70_host = ""
 3 | ftp70_username = ""
 4 | ftp70_password = ""
 5 | ftp70_porta = 0
 6 | 
 7 | # Server
 8 | siparq = ""
 9 | 
10 | # Database connection
11 | driver = "{SQL Server Native Client 11.0}"
12 | name_db = ""
13 | user_db = ""
14 | password_db = ""
15 | 
16 | # E-mail
17 | email_cobranca = ""
18 | email_teste = ""
19 | 
20 | def user(concessionaria, tipo):
21 |     if concessionaria == 'Bandeirante':
22 |         if tipo == 'User':
23 |             username_field = ''
24 |             return str(username_field)
25 |         if tipo == 'Password':
26 |             password_field = ''
27 |             return str(password_field)
28 |     if concessionaria == 'ESCELSA':
29 |         if tipo == 'User':
30 |             username_field = ''
31 |             return str(username_field)
32 |         if tipo == 'Password':
33 |             password_field = ''
34 |             return str(password_field)


--------------------------------------------------------------------------------
/selenium_webdriver/Leia-me.txt:
--------------------------------------------------------------------------------
 1 | Unzip the file "chromedriver_win32.zip"
 2 | and copy its contents into the Python root folder.
 3 | 
 4 | 1) Right-click "This PC / My Computer"
 5 | 2) Click "Advanced system settings"
 6 | 3) Click "Environment Variables"
 7 | 4) Select "Path" and click Edit
 8 | 5) Click New
 9 | 6) Add the path where you placed "chromedriver.exe"
10 | 
11 | Alternatively, download it from the website:
12 | https://chromedriver.chromium.org/downloads
13 | https://stackoverflow.com/questions/29858752/error-message-chromedriver-executable-needs-to-be-available-in-the-path
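
Once the PATH entry has been added, a quick check from Python can confirm that the driver is discoverable. The snippet below is a minimal sketch and not part of the repository: the helper name find_chromedriver is hypothetical, and it relies only on the standard-library function shutil.which, which searches the directories listed in PATH.

```python
import shutil


def find_chromedriver(name: str = "chromedriver"):
    """Return the full path of the driver if it is on PATH, otherwise None.

    `find_chromedriver` is a hypothetical helper used for illustration only.
    """
    # shutil.which searches PATH (and honors PATHEXT on Windows).
    return shutil.which(name)


if __name__ == "__main__":
    location = find_chromedriver()
    if location:
        print(f"chromedriver found at: {location}")
    else:
        print("chromedriver is not on PATH; review the PATH entry and reopen the terminal.")
```

A terminal that was already open before PATH was edited usually needs to be reopened before the change becomes visible.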


--------------------------------------------------------------------------------
/selenium_webdriver/chromedriver.exe:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/marquescharlon/Automatizacao-Selenium-Python/4ded05350eaf8bb11ac953125dcd3493cbc310a3/selenium_webdriver/chromedriver.exe


--------------------------------------------------------------------------------
/selenium_webdriver/chromedriver_win32.zip:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/marquescharlon/Automatizacao-Selenium-Python/4ded05350eaf8bb11ac953125dcd3493cbc310a3/selenium_webdriver/chromedriver_win32.zip


--------------------------------------------------------------------------------
/unzip_files.py:
--------------------------------------------------------------------------------
 1 | import os.path
 2 | import os
 3 | import zipfile as zf
 4 | 
 5 | def files(source_folder: str = '', file_begin: str = '', file_end: str = '', delete_file=False):
 6 | 
 7 |     """
 8 |     Folder containing the files: source_folder='' \n
 9 |     Match on the start of the file name: file_begin='' \n
10 |         p.identify_file(concessionaria, 'begin') \n
11 |     Match on the end of the file name: file_end='' \n
12 |         p.identify_file(concessionaria, 'end') \n
13 |     Delete the .ZIP file after extraction? delete_file=True
14 |     """
15 | 
16 |     file_names = os.listdir(source_folder)
17 | 
18 |     for file in file_names:
19 |         if file_begin in file:
20 |             if file_end in file:
21 | 
22 |                 path = os.path.join(source_folder, file)
23 | 
24 |                 with zf.ZipFile(path) as z:
25 |                     z.extractall(source_folder)
26 | 
27 |                 if delete_file:
28 |                     os.remove(path)
29 | 
30 | if __name__ == '__main__':
31 | 
32 |     import parameters as p
33 | 
34 |     concessionaria = ''  # set to 'Bandeirante' or 'ESCELSA' before running
35 | 
36 |     files(
37 |         p.temp_folder, 
38 |         p.identify_file(concessionaria, 'begin'), 
39 |         p.identify_file(concessionaria, 'end'), 
40 |         delete_file=False
41 |     )


--------------------------------------------------------------------------------