Import jobs from CSV; use RuboCop to fix code convention violations
Tô Ngọc Ánh @anhtn
added 1 commit
- 43e10e2d - Edit gitignore
[Compare with previous version](https://gitlab.zigexn.vn/anhtn/VeNJob/merge_requests/5/diffs?diff_id=4853&start_sha=802a0c27eaae0d5bd208fbda1f554560eaf7ee4f)

Tô Ngọc Ánh @anhtn
added 1 commit
- 94c7493b - Remove file in lib/data folder
[Compare with previous version](https://gitlab.zigexn.vn/anhtn/VeNJob/merge_requests/5/diffs?diff_id=4854&start_sha=43e10e2da58b4ca33d6d8555f436e23d96559499)

     #
     default: &default
       adapter: mysql2
    -  encoding: utf8
    +  encoding: utf8mb4

Thanh Hung Pham @hungpt commented (Master, edited)
@anhtn What does `utf8mb4` mean? Why is it used here?

Tô Ngọc Ánh @anhtn commented (Master, edited)
*The difference between utf8 and utf8mb4 is that the former can only store 3 byte characters, while the latter can store 4 byte characters. In Unicode terms, utf8 can only store characters in the Basic Multilingual Plane, while utf8mb4 can store any Unicode character.*
With utf8, mysql2 raised the error `Incorrect string value: '\xF0\x9F\x93\xB1Ph...'`. I searched for a fix and found utf8mb4.
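To make the byte counts concrete, here is a small Ruby sketch (not part of the merge request) that inspects the character behind the `\xF0\x9F\x93\xB1` prefix in the error message. The helper name `fits_utf8mb3?` is hypothetical; it only models the three-bytes-per-character limit of MySQL's `utf8` charset and does not talk to a database.

```ruby
# The four bytes from the error message decode to a single character.
phone = "\xF0\x9F\x93\xB1".dup.force_encoding(Encoding::UTF_8)

# Hypothetical helper: true when every character needs at most 3 bytes
# in UTF-8, i.e. it would fit in a MySQL `utf8` (utf8mb3) column.
def fits_utf8mb3?(text)
  text.each_char.all? { |char| char.bytesize <= 3 }
end

puts phone.valid_encoding?          # true
puts phone.bytesize                 # 4
puts format('U+%X', phone.ord)      # U+1F4F1
puts fits_utf8mb3?('Tô Ngọc Ánh')   # true
puts fits_utf8mb3?("#{phone}Phone") # false
```

The code point U+1F4F1 lies outside the Basic Multilingual Plane, which is why it needs four bytes and why the column charset had to change.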
    uri = URI.parse(CGI.escape(job_link)) # fix error: uri must be ascii only
    document = Nokogiri::HTML(URI.open(uri))
    job_title = document.at_css('.job-desc p.title').text
    return if job_title.empty?

    job_company_link = document.at_css('.job-desc a.job-company-name')[:href]
    job_company = crawl_company(job_company_link)
    return if job_company.nil?

    job_location_name = document.css('.map p a').map { |val| val.text.strip }
    job_locations = Location.where(city: job_location_name)

    job_industry_names = document.at_xpath('//li[./strong/em[contains(@class, "mdi mdi-briefcase")]]').css('p a').map { |val| val.text.strip }
    job_industries = Industry.where(name: job_industry_names)

    job_salary = document.at_xpath('//li[./strong/i[contains(@class, "fa fa-usd")]]/p').try(:text).try(:strip)

Thanh Hung Pham @hungpt commented (Master, edited)
@anhtn Why do you use `.try(:text).try(:strip)` here?

Tô Ngọc Ánh @anhtn commented (Master, edited)
Because the object is sometimes nil, calling `text` and `strip` on it would raise an error, so I use `try` to skip the call when it is nil.
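For comparison, plain Ruby's safe-navigation operator `&.` gives the same nil-skipping behaviour without ActiveSupport. The sketch below is illustrative only: `fake_node` is a hypothetical stand-in for the Nokogiri node, and `salary_text` is not a method from the merge request.

```ruby
# Hypothetical stand-in for a Nokogiri node; only #text is modelled.
fake_node = Struct.new(:text)

# Returns the stripped text, or nil when the node was not found.
def salary_text(node)
  node&.text&.strip
end

puts salary_text(fake_node.new("  1,000 USD \n")).inspect # "1,000 USD"
puts salary_text(nil).inspect                             # nil
```

One difference to keep in mind: `try` also returns nil when a non-nil receiver does not respond to the method, whereas `&.` raises `NoMethodError` in that case.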
Tô Ngọc Ánh @anhtn
changed this line in [version 5 of the diff](https://gitlab.zigexn.vn/anhtn/VeNJob/merge_requests/5/diffs?diff_id=4867&start_sha=ee6a23cd72f84d16ace617ed025b55542aeafacd#b321b772986de9dfe9db0ed4138ae166e577f241_66_0)

     gem 'listen', '>= 3.0.5', '< 3.2'
     # Spring speeds up development by keeping your application running in the background. Read more: https://github.com/rails/spring
     gem 'spring'
    -gem 'spring-watcher-listen', '~> 2.0.0'
     gem 'dotenv-rails'
    +gem 'spring-watcher-listen', '~> 2.0.0'

Thanh Hung Pham @hungpt commented (Master)

Tô Ngọc Ánh @anhtn commented (Master, edited)
The spring gem is included by default when the project is generated. RuboCop complained that dotenv has to come before spring, so I changed the order accordingly. I think this gem is there to speed things up when running on localhost.

    -require "open-uri"
    +require 'open-uri'
     
    +@logger ||= Logger.new("#{Rails.root}/log/crawler.log")
     
     namespace :crawl do
    -  desc "crawl industries locations jobs"
    -  task :crawl_industries_locations_jobs, [:page, :link] => [:environment] do |task, args|
    +  desc 'crawl industries locations jobs'
    +  task :crawl_industries_locations_jobs, %i[page link] => [:environment] do |_, args|

Thanh Hung Pham @hungpt commented (Master, edited)
@anhtn What does `[:environment]` mean here?

Tô Ngọc Ánh @anhtn commented (Master, edited)
I added it so that the task can interact with the DB.
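In Rake terms, `[:environment]` is a prerequisite list: the `:environment` task runs before the task body. In a Rails app that task loads the application, which is what makes the models and the DB connection available. The sketch below uses plain Rake with a stand-in `:environment` task; the task name `crawl_demo` and the `loaded` array are hypothetical.

```ruby
require 'rake'

loaded = [] # records the order in which the tasks run

# Stand-in for the :environment task that Rails defines; the real one
# loads the application (and with it the models and DB configuration).
Rake::Task.define_task(:environment) do
  loaded << :environment
end

# Same shape as the task in the diff: task arguments => prerequisites.
Rake::Task.define_task(:crawl_demo, %i[page link] => [:environment]) do |_, args|
  loaded << [:crawl_demo, args[:page]]
end

Rake::Task[:crawl_demo].invoke('3')
p loaded # [:environment, [:crawl_demo, "3"]]
```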
Tô Ngọc Ánh @anhtn
changed this line in [version 5 of the diff](https://gitlab.zigexn.vn/anhtn/VeNJob/merge_requests/5/diffs?diff_id=4867&start_sha=ee6a23cd72f84d16ace617ed025b55542aeafacd#b321b772986de9dfe9db0ed4138ae166e577f241_7_0)

lib/tasks/ftp_import.rake 0 → 100644
    require 'csv'
    require 'zip'
    require_relative '../common/ftp'

    namespace :ftp_import do

Thanh Hung Pham @hungpt commented (Master, edited)
@anhtn Please fix the file name and the namespace name. `csv_import` fits the context better.

Tô Ngọc Ánh @anhtn
changed this line in [version 4 of the diff](https://gitlab.zigexn.vn/anhtn/VeNJob/merge_requests/5/diffs?diff_id=4862&start_sha=94c7493b09b80a59666bfeb851662342fe2d0c86#dcd6e8014a70f3396bf4cf32a6198934e8eb5594_5_0)

lib/tasks/ftp_import.rake 0 → 100644
        Zip::File.open(file) do |zip_file|
          zip_file.each do |f|
            fpath = File.join(destination, f.name)
            zip_file.extract(f, fpath) unless File.exist?(fpath)
          end
        end
      end

      def import_job(direction)
        # i = 0
        CSV.foreach("#{direction}/jobs.csv", headers: true) do |row|
          # i+=1
          next if row['name'].blank? || !row['category'].is_a?(String) || row['company name'].blank?

          title = row['name'].strip
          company = Company.find_or_create_by(name: row['company name']) do |c|

Thanh Hung Pham @hungpt commented (Master, edited)
@anhtn Is the variable still named `c`? Please give it a more descriptive name.

Tô Ngọc Ánh @anhtn commented (Master, edited)
I already assign the result to the `company` variable above, so I used `c` inside the block to avoid a name clash.

Tô Ngọc Ánh @anhtn
changed this line in [version 4 of the diff](https://gitlab.zigexn.vn/anhtn/VeNJob/merge_requests/5/diffs?diff_id=4862&start_sha=94c7493b09b80a59666bfeb851662342fe2d0c86#dcd6e8014a70f3396bf4cf32a6198934e8eb5594_35_0)

lib/tasks/ftp_import.rake 0 → 100644
          locations = Location.where(city: locations_name)
          locations = locations_name.map { |city| Location.create(oversea: false, city: city) } if locations.empty?
          description = "Benefits:\n#{row['benefit']}\n"\
                        "Descriptions:\n#{row['description']}\n"\
                        "Requirements:\n#{row['requirement']}"

          Job.find_or_create_by(title: title, company_id: company.id, level: level, salary: salary) do |job|
            job.industries << industry
            job.locations << locations
            job.description = description
          end
          puts title
        end
      rescue StandardError => e
        puts e
        @logger.error e.message

Thanh Hung Pham @hungpt commented (Master, edited)
@anhtn Here you should log which row the `error` occurred on and what data caused the exception.
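One way to act on this request is sketched below. It is an assumption-laden illustration, not the code from the merge request: the sample data, the `import_row` helper, and the in-memory logger are all hypothetical, and the rescue is placed inside the loop so that one bad row does not stop the rest of the import.

```ruby
require 'csv'
require 'logger'
require 'stringio'

# Hypothetical sample data; the second data row has no name.
csv_text = <<~DATA
  name,company name
  Ruby Developer,Acme
  ,Acme
DATA

log_output = StringIO.new
logger = Logger.new(log_output)

# Hypothetical per-row step that raises on bad data.
def import_row(row)
  raise ArgumentError, 'name is blank' if row['name'].to_s.strip.empty?

  row['name'].strip
end

imported = []
index = 0
CSV.parse(csv_text, headers: true).each do |row|
  index += 1
  imported << import_row(row)
rescue StandardError => e
  # Log the row number and the raw row so the bad record can be found.
  logger.error("Row #{index}: #{e.message} (data: #{row.to_h})")
end

puts imported.inspect
puts log_output.string
```

The version in the diff rescues once around the whole method, so the first exception ends the import; rescuing per row is a design choice that trades that early stop for a complete log of bad rows.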
Tô Ngọc Ánh @anhtn
changed this line in [version 4 of the diff](https://gitlab.zigexn.vn/anhtn/VeNJob/merge_requests/5/diffs?diff_id=4862&start_sha=94c7493b09b80a59666bfeb851662342fe2d0c86#dcd6e8014a70f3396bf4cf32a6198934e8eb5594_61_0)

Tô Ngọc Ánh @anhtn
added 1 commit
- ee6a23cd - csv import: prevent datatype error
[Compare with previous version](https://gitlab.zigexn.vn/anhtn/VeNJob/merge_requests/5/diffs?diff_id=4862&start_sha=94c7493b09b80a59666bfeb851662342fe2d0c86)

Tô Ngọc Ánh @anhtn
added 1 commit
- c99884c7 - separating def into class
[Compare with previous version](https://gitlab.zigexn.vn/anhtn/VeNJob/merge_requests/5/diffs?diff_id=4867&start_sha=ee6a23cd72f84d16ace617ed025b55542aeafacd)

    -@logger ||= Logger.new("#{Rails.root}/log/crawler.log")
    -
    -namespace :crawl do
    -  desc "crawl industries locations jobs"
    -  task :crawl_industries_locations_jobs, [:page, :link] => [:environment] do |task, args|
    -    args.with_defaults(link: 'https://careerbuilder.vn/viec-lam/tat-ca-viec-lam-vi.html')
    -    crawl_industries_and_locations
    -    job_links = get_job_links(args[:page].to_i, args[:link])
    +require 'open-uri'
    +
    +class Crawler
    +  def initialize(logger)
    +    @logger = logger
    +  end
    +
    +  def crawl_data(page, base_link)

Thanh Hung Pham @hungpt commented (Master, edited)
@anhtn Rename `page` to `page_number` so it is easier to understand!

Tô Ngọc Ánh @anhtn
changed this line in [version 7 of the diff](https://gitlab.zigexn.vn/anhtn/VeNJob/merge_requests/5/diffs?diff_id=4875&start_sha=931465fed4132d35a839ac648e16aa2c43ca02c5#4440733d81f6278a884736182acb2be64e423d01_8_8)

lib/common/csv.rb 0 → 100644
    require 'csv'
    require './lib/common/extract_zip'

    class CsvImport
      include ExtractZip

      def initialize(logger)
        @logger = logger
      end

      def import_job(direction)
        index = 0
        CSV.foreach("#{direction}/jobs.csv", headers: true) do |row|

Thanh Hung Pham @hungpt commented (Master, edited)
@anhtn Have you tried the `with_index` method here? Then you do not need to declare a variable: `CSV.foreach("#{direction}/jobs.csv", headers: true).with_index(1) do |row, index|`
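The suggestion works because `CSV.foreach` returns an Enumerator when it is called without a block. A minimal sketch, assuming a hypothetical two-row temporary file in place of `jobs.csv`:

```ruby
require 'csv'
require 'tempfile'

# Hypothetical two-row file standing in for jobs.csv.
file = Tempfile.new(['jobs', '.csv'])
file.write("name,category\nRuby Developer,IT\nAccountant,Finance\n")
file.close

numbered = []
# Without a block, CSV.foreach returns an Enumerator, so with_index(1)
# can supply a 1-based row counter instead of a hand-maintained variable.
CSV.foreach(file.path, headers: true).with_index(1) do |row, index|
  numbered << [index, row['name']]
end
file.unlink

p numbered # [[1, "Ruby Developer"], [2, "Accountant"]]
```

The index counts data rows, not file lines, because the header row is consumed by `headers: true`.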
Tô Ngọc Ánh @anhtn
changed this line in [version 6 of the diff](https://gitlab.zigexn.vn/anhtn/VeNJob/merge_requests/5/diffs?diff_id=4873&start_sha=c99884c73f7bae243262f2ba11290345e70c197e#b5e578995bd192fd1aba3e421655f3ff0e2931e4_13_12)

lib/common/csv.rb 0 → 100644
    require 'csv'
    require './lib/common/extract_zip'

    class CsvImport
      include ExtractZip

      def initialize(logger)
        @logger = logger
      end

      def import_job(direction)
        index = 0
        CSV.foreach("#{direction}/jobs.csv", headers: true) do |row|
          index += 1
          next if integer?(row['category'])

Thanh Hung Pham @hungpt commented (Master, edited)
@anhtn You should check `row['category']` for nil and empty first.

Tô Ngọc Ánh @anhtn
changed this line in [version 6 of the diff](https://gitlab.zigexn.vn/anhtn/VeNJob/merge_requests/5/diffs?diff_id=4873&start_sha=c99884c73f7bae243262f2ba11290345e70c197e#b5e578995bd192fd1aba3e421655f3ff0e2931e4_15_12)

lib/common/csv.rb 0 → 100644
          Job.find_or_create_by(title: title, company_id: company.id, level: level, salary: salary) do |job|
            job.industries << industry
            job.locations << locations
            job.description = description
          end
          puts title
        end
      rescue StandardError => e
        puts e
        @logger.error "Job #{index}: #{e.message}"
      end

      private

      def integer?(str)

Thanh Hung Pham @hungpt commented (Master, edited)
@anhtn Try another way to do this check. How about using a `regex`?
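The thread does not show the body of `integer?`, so the following is only a sketch of what a regex-based check could look like. It also folds in the nil and empty guard requested for `row['category']`; the helper `skip_category?` and the rule "skip rows whose category is missing or a bare number" are assumptions drawn from the `next if` lines in the diff.

```ruby
# True when the CSV field looks like an integer such as "42" or "-7".
# nil and empty fields are not integers, so they return false.
def integer?(value)
  /\A[+-]?\d+\z/.match?(value.to_s.strip)
end

# Hypothetical guard: skip a row when the category is missing, blank,
# or a bare number.
def skip_category?(category)
  category.nil? || category.strip.empty? || integer?(category)
end

puts integer?('42')        # true
puts integer?('IT')        # false
puts integer?(nil)         # false
puts skip_category?(nil)   # true
puts skip_category?('  ')  # true
puts skip_category?('12')  # true
puts skip_category?('IT')  # false
```

Anchoring the pattern with `\A` and `\z` matters: without them a value such as `"12 IT"` would match on its numeric prefix.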
Tô Ngọc Ánh @anhtn
changed this line in [version 6 of the diff](https://gitlab.zigexn.vn/anhtn/VeNJob/merge_requests/5/diffs?diff_id=4873&start_sha=c99884c73f7bae243262f2ba11290345e70c197e#b5e578995bd192fd1aba3e421655f3ff0e2931e4_48_42)

Tô Ngọc Ánh @anhtn
resolved all discussions

Tô Ngọc Ánh @anhtn
added 1 commit
- 931465fe - fix no directory, refactor code in csv.rb
[Compare with previous version](https://gitlab.zigexn.vn/anhtn/VeNJob/merge_requests/5/diffs?diff_id=4873&start_sha=c99884c73f7bae243262f2ba11290345e70c197e)

Tô Ngọc Ánh @anhtn
added 1 commit
- 7df5cbbd - change page variable => page_number
[Compare with previous version](https://gitlab.zigexn.vn/anhtn/VeNJob/merge_requests/5/diffs?diff_id=4875&start_sha=931465fed4132d35a839ac648e16aa2c43ca02c5)

Tô Ngọc Ánh @anhtn
mentioned in commit 3a9ba394f68919daecfb082f586f4553b43f6460

Tô Ngọc Ánh @anhtn
merged