  • Ngô Trung Hưng
  • venjob_nth
  • Merge Requests
  • !2

Merged
Opened Jul 28, 2020 by Ngô Trung Hưng @hungnt

done crawler


Check out, review, and merge locally

Step 1. Fetch and check out the branch for this merge request

git fetch origin
git checkout -b crawler origin/crawler

Step 2. Review the changes locally

Step 3. Merge the branch and fix any conflicts that come up

git checkout master
git merge --no-ff crawler

Step 4. Push the result of the merge to GitLab

git push origin master

Note that pushing to GitLab requires write access to this repository.

Tip: You can also checkout merge requests locally by following these guidelines.

  • Discussion 69
  • Commits 14
  • Pipelines 14
  • Changes 19
  • Hoang Phuc Do
    @phucdh started a discussion on an old version of the diff Jul 28, 2020
    Last updated by Ngô Trung Hưng Jul 28, 2020
    lib/src/interface_web.rb 0 → 100644
    135 exprience = exp
    136 add_data(name, company_name, city_name, created_date, expiration_date, salary, industry_name, description, level, exprience)
    137 end
    138
    139 def self.crawl_data_jobs_interface_2(page)
    140 name = page.search('.apply-now-content .job-desc .title').text
    141 company_name = page.search('.top-job .top-job-info .tit_company').text
    142 location = []
    143 length = page.search('.info-workplace .value a').size
    144 length.times do |n|
    145 location << page.search(".info-workplace .value a:nth-child(#{n + 1})").text
    146 end
    147 city_name = location.join(',')
    148 created_date = ''
    149 expiration_date = page.search('.info li:nth-child(4)').text
    150 expiration_date = if expiration_date.blank?
    • Hoang Phuc Do @phucdh commented Jul 28, 2020
      Master

      When there is only a single if/else condition, you can use the ternary operator:

      ```ruby
      expiration_date = expiration_date.present? ? expiration_date.to_s.delete!("[\n,\t,\r]").split(' ').last : ''
      ```

      Edited Aug 04, 2020
    • Ngô Trung Hưng @hungnt commented Jul 28, 2020
      Master

      Yes, understood.
    • Ngô Trung Hưng @hungnt

      changed this line in [version 2 of the diff](https://gitlab.zigexn.vn/hungnt/venjob_nth/merge_requests/2/diffs?diff_id=4863&start_sha=b5e63c5d62c9b87e1966cb1b6976ee0a23c0a62f#2081481459f47e5e9b1f786abd02f6662de28d8d_150_155)

      Jul 28, 2020
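Note on the suggested one-liner: `String#delete!` returns `nil` when it removes nothing, so chaining `.split` after it can raise `NoMethodError` on text that has no matching characters. The argument `"[\n,\t,\r]"` is also read as a plain character set, so brackets and commas are deleted as well. A minimal plain-Ruby sketch of the same ternary with the non-bang `delete` follows. `clean_expiration` is a hypothetical helper name, the sample string is made up, and `empty?` stands in for ActiveSupport's `blank?`/`present?`.

```ruby
# Hypothetical helper, not part of the merge request.
# Plain Ruby: `empty?` stands in for ActiveSupport's `blank?`.
def clean_expiration(raw)
  # Non-bang `delete` always returns a String, even when nothing was removed.
  raw.empty? ? '' : raw.delete("\n\t\r").split(' ').last.to_s
end

# `delete!` returns nil when the string is left unchanged.
p 'no control characters'.dup.delete!("\n\t\r")

# Made-up sample input.
p clean_expiration("Deadline:\n\t 30/08/2020")
p clean_expiration('')
```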
  • Hoang Phuc Do
    @phucdh started a discussion on an old version of the diff Jul 28, 2020
    Resolved by Ngô Trung Hưng Aug 04, 2020
    lib/src/interface_web.rb 0 → 100644
    150 expiration_date = if expiration_date.blank?
    151 ''
    152 else
    153 expiration_date.to_s.delete!("[\n,\t,\r]").split(' ').last
    154 end
    155 salary = page.search('.info li:nth-child(3)').text.split('Lương').last.strip
    156 industry_name = page.search('.info li:nth-child(5) .value').text
    157 description = page.search('.left-col').to_s
    158 lv = page.search('.boxtp .info li:nth-child(2)').text
    159 level = if lv.blank?
    160 ''
    161 else
    162 lv.delete!("[\n,\t,\r]").strip.split('Cấp bậc').last.strip
    163 end
    164 exp = page.search('.info li:nth-child(6)').text
    165 exprience = if exp.blank?
    • Hoang Phuc Do @phucdh commented Jul 28, 2020
      Master

      When there is only a single if/else condition, you can use the ternary operator:

      ```ruby
      exprience = exp.present? ? exp.delete!("[\n,\t,\r]").split('Kinh nghiệm').last.strip : ''
      ```

      Edited Aug 04, 2020
    • Ngô Trung Hưng @hungnt

      changed this line in [version 2 of the diff](https://gitlab.zigexn.vn/hungnt/venjob_nth/merge_requests/2/diffs?diff_id=4863&start_sha=b5e63c5d62c9b87e1966cb1b6976ee0a23c0a62f#2081481459f47e5e9b1f786abd02f6662de28d8d_165_162)

      Jul 28, 2020
  • Hoang Phuc Do
    @phucdh started a discussion on an old version of the diff Jul 28, 2020
    Last updated by Ngô Trung Hưng Jul 28, 2020
    lib/src/interface_web.rb 0 → 100644
    138
    139 def self.crawl_data_jobs_interface_2(page)
    140 name = page.search('.apply-now-content .job-desc .title').text
    141 company_name = page.search('.top-job .top-job-info .tit_company').text
    142 location = []
    143 length = page.search('.info-workplace .value a').size
    144 length.times do |n|
    145 location << page.search(".info-workplace .value a:nth-child(#{n + 1})").text
    146 end
    147 city_name = location.join(',')
    148 created_date = ''
    149 expiration_date = page.search('.info li:nth-child(4)').text
    150 expiration_date = if expiration_date.blank?
    151 ''
    152 else
    153 expiration_date.to_s.delete!("[\n,\t,\r]").split(' ').last
    • Hoang Phuc Do @phucdh commented Jul 28, 2020
      Master

      What is the purpose of calling `#to_s` here?
    • Ngô Trung Hưng @hungnt commented Jul 28, 2020
      Master

      That call is redundant; my mistake.
    • Ngô Trung Hưng @hungnt

      changed this line in [version 2 of the diff](https://gitlab.zigexn.vn/hungnt/venjob_nth/merge_requests/2/diffs?diff_id=4863&start_sha=b5e63c5d62c9b87e1966cb1b6976ee0a23c0a62f#2081481459f47e5e9b1f786abd02f6662de28d8d_153_155)

      Jul 28, 2020
  • Hoang Phuc Do
    @phucdh started a discussion on an old version of the diff Jul 28, 2020
    Last updated by Ngô Trung Hưng Jul 28, 2020
    lib/src/interface_web.rb 0 → 100644
    34 def self.safe_link(url)
    35 Nokogiri::HTML(URI.parse(URI.escape(url)))
    36 end
    37
    38 def self.craw_data_cities
    39 page = Nokogiri::HTML(URI.open('https://careerbuilder.vn/viec-lam/tat-ca-viec-lam-vi.html'))
    40 puts "Crawling data location... \n. \n. \n."
    41 data_list_cities = []
    42 data = page.search('#location option')
    43 list_cities = data.to_s.split('</option>')
    44 list_cities.each do |x|
    45 data_list_cities << x.gsub(/(^<[\w\D]*>)/, '').gsub(/\n/, '').rstrip
    46 end
    47 puts 'Save data to database...'
    48 data_list_cities.each_with_index do |val, index|
    49 area = index > 69 ? 0 : 1
    • Hoang Phuc Do @phucdh commented Jul 28, 2020
      Master

      `69`, `0`, and `1` are magic numbers. Assign them to constants whose names make the purpose of each value clear.
    • Ngô Trung Hưng @hungnt commented Jul 28, 2020
      Master

      Yes, understood.
    • Ngô Trung Hưng @hungnt

      changed this line in [version 2 of the diff](https://gitlab.zigexn.vn/hungnt/venjob_nth/merge_requests/2/diffs?diff_id=4863&start_sha=b5e63c5d62c9b87e1966cb1b6976ee0a23c0a62f#2081481459f47e5e9b1f786abd02f6662de28d8d_49_53)

      Jul 28, 2020
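As a rough sketch of what the comment asks for, the three literals can be given names. The module and constant names below are hypothetical (later versions of the diff use `INTERNATIONAL`, `DOMESTIC`, and `RANGE`); the values are the ones from the reviewed line.

```ruby
# Hypothetical module and constant names chosen for illustration.
module CityArea
  INTERNATIONAL = 0
  DOMESTIC = 1
  LAST_DOMESTIC_INDEX = 69 # position of the last domestic city in the scraped list

  # Same logic as `index > 69 ? 0 : 1`, with every literal named.
  def self.for_index(index)
    index > LAST_DOMESTIC_INDEX ? INTERNATIONAL : DOMESTIC
  end
end

p CityArea.for_index(0)
p CityArea.for_index(70)
```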
  • Hoang Phuc Do
    @phucdh started a discussion on the diff Jul 28, 2020
    Last updated by Hoang Phuc Do Jul 28, 2020
    lib/tasks/crawler.rake 0 → 100644
    1 # frozen_string_literal: true
    2
    3 require 'open-uri'
    • Hoang Phuc Do @phucdh commented Jul 28, 2020
      Master

      What is the purpose of requiring the `open-uri` and `logger` libraries in this rake file?

      Edited Jul 29, 2020
    • Ngô Trung Hưng @hungnt commented Jul 28, 2020
      Master

      `logger` is there because I plan to write logs. `open-uri` is needed to call Nokogiri::HTML(URI.open('link')); without require 'open-uri' it raises: NoMethodError (private method `open' called for URI:Module)
    • Hoang Phuc Do @phucdh commented Jul 28, 2020
      Master

      This library is used in `lib/src/interface_web.rb`, so shouldn't it be required in that file instead?

      Edited Jul 29, 2020
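A sketch of the point raised above: the `require` sits at the top of the file that actually calls `URI.open`. `PageFetcher` is a hypothetical stand-in for the class in `lib/src/interface_web.rb`. No request is made here; the last line only checks that the method became public.

```ruby
# frozen_string_literal: true

require 'open-uri' # makes URI.open public; without it the call raises NoMethodError

# Hypothetical stand-in for the class defined in lib/src/interface_web.rb.
class PageFetcher
  def self.fetch(url)
    # Returns the response body as a String.
    URI.open(url, &:read)
  end
end

# No network access here: just confirm that URI.open is callable.
p URI.respond_to?(:open)
```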
  • Hoang Phuc Do
    @phucdh started a discussion on an old version of the diff Jul 28, 2020
    Last updated by Ngô Trung Hưng Jul 28, 2020
    lib/src/interface_web.rb 0 → 100644
    50 City.find_or_create_by(name: val) do |city|
    51 city.name = val
    52 city.area = area
    53 end
    54 end
    55 end
    56
    57 def self.craw_data_companies
    58 puts 'Crawl data companies'
    59 link_crawl = get_link_job_and_companies
    60 link_crawl[0].each do |url|
    61 page = Nokogiri::HTML(URI.open(URI.parse(URI.escape(url))))
    62 name = ''
    63 address = ''
    64 desc = ''
    65 if page.search('.company-info .info .content .name').text == ''
    • Hoang Phuc Do @phucdh commented Jul 28, 2020
      Master

      `if page.search('.company-info .info .content .name').text.blank?`
    • Ngô Trung Hưng @hungnt

      changed this line in [version 2 of the diff](https://gitlab.zigexn.vn/hungnt/venjob_nth/merge_requests/2/diffs?diff_id=4863&start_sha=b5e63c5d62c9b87e1966cb1b6976ee0a23c0a62f#2081481459f47e5e9b1f786abd02f6662de28d8d_65_69)

      Jul 28, 2020
  • Hoang Phuc Do
    @phucdh started a discussion on an old version of the diff Jul 28, 2020
    Last updated by Ngô Trung Hưng Jul 28, 2020
    lib/src/interface_web.rb 0 → 100644
    55 end
    56
    57 def self.craw_data_companies
    58 puts 'Crawl data companies'
    59 link_crawl = get_link_job_and_companies
    60 link_crawl[0].each do |url|
    61 page = Nokogiri::HTML(URI.open(URI.parse(URI.escape(url))))
    62 name = ''
    63 address = ''
    64 desc = ''
    65 if page.search('.company-info .info .content .name').text == ''
    66 name = page.search('.section-page #cp_company_name').text
    67 address = page.search('.section-page .cp_basic_info_details ul li:nth-child(1)').text
    68 desc = page.search('.cp_aboutus_item .content_fck').text
    69 else
    70 name = page.search('.company-info .info .content .name').text
    • Hoang Phuc Do @phucdh commented Jul 28, 2020
      Master

      This expression, `page.search('.company-info .info .content .name').text`, duplicates the one above. Extract it into a variable or a method so it can be shared.
    • Ngô Trung Hưng @hungnt

      changed this line in [version 2 of the diff](https://gitlab.zigexn.vn/hungnt/venjob_nth/merge_requests/2/diffs?diff_id=4863&start_sha=b5e63c5d62c9b87e1966cb1b6976ee0a23c0a62f#2081481459f47e5e9b1f786abd02f6662de28d8d_70_75)

      Jul 28, 2020
  • Ngô Trung Hưng @hungnt

    added 1 commit

    • 5683fa11 - fix crawler

    [Compare with previous version](https://gitlab.zigexn.vn/hungnt/venjob_nth/merge_requests/2/diffs?diff_id=4863&start_sha=b5e63c5d62c9b87e1966cb1b6976ee0a23c0a62f)

    Jul 28, 2020
  • Hoang Phuc Do
    @phucdh started a discussion on an old version of the diff Jul 28, 2020
    Last updated by Ngô Trung Hưng Jul 28, 2020
    lib/src/interface_web.rb 0 → 100644
    1 # frozen_string_literal: true
    2
    3 require 'open-uri'
    4
    5 # Crawler data
    6 class InterfaceWeb
    7 INTERNATION = 0
    • Hoang Phuc Do @phucdh commented Jul 28, 2020
      Master

      Quốc tế => International, Trong nước => Domestic
    • Ngô Trung Hưng @hungnt commented Jul 28, 2020
      Master

      Thanks.

      Edited Jul 28, 2020 by Ngô Trung Hưng
    • Ngô Trung Hưng @hungnt

      changed this line in [version 3 of the diff](https://gitlab.zigexn.vn/hungnt/venjob_nth/merge_requests/2/diffs?diff_id=4864&start_sha=5683fa1122dbbd23778eab566fda50595be10b7d#2081481459f47e5e9b1f786abd02f6662de28d8d_7_7)

      Jul 28, 2020
  • Hoang Phuc Do
    @phucdh started a discussion on an old version of the diff Jul 28, 2020
    Last updated by Ngô Trung Hưng Jul 29, 2020
    lib/src/interface_web.rb 0 → 100644
    1 # frozen_string_literal: true
    2
    3 require 'open-uri'
    4
    5 # Crawler data
    6 class InterfaceWeb
    7 INTERNATION = 0
    8 VIETNAM = 1
    9 RANGE = 69
    • Hoang Phuc Do @phucdh commented Jul 28, 2020
      Master

      Why is the range 69? Could it be 70?
    • Ngô Trung Hưng @hungnt commented Jul 28, 2020
      Master

      I am using `data_list_cities.each_with_index do |val, index|`. The index starts at 0 and there are 70 domestic locations, so I let it run up to 69.
    • Ngô Trung Hưng @hungnt

      changed this line in [version 6 of the diff](https://gitlab.zigexn.vn/hungnt/venjob_nth/merge_requests/2/diffs?diff_id=4882&start_sha=8fecd429de43ac36f58321166eb4084a2eda70b9#2081481459f47e5e9b1f786abd02f6662de28d8d_12_10)

      Jul 29, 2020
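For integer indexes the two spellings are equivalent: `index > 69` and `index >= 70` select the same positions. A small sketch, under the assumption stated in the reply that the first 70 entries are domestic; `DOMESTIC_CITY_COUNT` and `domestic?` are hypothetical names.

```ruby
DOMESTIC_CITY_COUNT = 70 # assumption from the reply: the first 70 entries are domestic

def domestic?(index)
  # Indexes 0..69 are the 70 domestic entries.
  index < DOMESTIC_CITY_COUNT
end

# Compare the count-based check with the original `index > 69` on every index.
agree = (0..200).all? { |index| domestic?(index) == !(index > 69) }
p agree
```

Storing the count rather than the last index lets the constant match the number people quote ("70 cities") and avoids the off-by-one question.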
  • Hoang Phuc Do
    @phucdh started a discussion on an old version of the diff Jul 28, 2020
    Last updated by Ngô Trung Hưng Jul 28, 2020
    lib/src/interface_web.rb 0 → 100644
    20 page = Nokogiri::HTML(URI.open("https://careerbuilder.vn/viec-lam/tat-ca-viec-lam-trang-#{i + 1}-vi.html"))
    21 link_companies = page.search('.figcaption .caption @href')
    22 website_companies += link_companies.map(&:value).uniq
    23 link_jobs = page.search('.figcaption .title .job_link @href')
    24 website_jobs += link_jobs.map(&:value)
    25 break if website_jobs.include?(@@stop_crawl)
    26 end
    27 website_companies = website_companies.select { |val| val.present? && val != 'javascript:void(0);' }
    28 website_jobs = website_jobs.select(&:present?)
    29 puts "Result:\nCompany: #{website_companies.length} link\nJob : #{website_jobs.length} link\n--------------"
    30 File.write('tmp/link.txt', website_jobs[0])
    31 data << website_companies << website_jobs
    32 end
    33
    34 def self.link_job_and_companies
    35 @link_job_and_companies ||= crawl_link(3)
    • Hoang Phuc Do @phucdh commented Jul 28, 2020
      Master

      A memoized variable named `link_job_and_companies` is used here.

      Is this variable a class variable or an instance variable?

      Edited Jul 28, 2020 by Hoang Phuc Do
    • Ngô Trung Hưng @hungnt commented Jul 28, 2020
      Master

      `@link_job_and_companies` would be an instance variable.
    • Hoang Phuc Do @phucdh commented Jul 28, 2020
      Master

      `@link_job_and_companies` is an instance variable, declared inside `self.link_job_and_companies`, which is a class method.

      To create an instance of `InterfaceWeb`, use the following syntax:

      ```ruby
      crawler = InterfaceWeb.new
      ```

      If `@link_job_and_companies` is an instance variable, can it be accessed from `crawler`? If not, is `@link_job_and_companies` really an instance variable?

      Edited Jul 28, 2020 by Hoang Phuc Do
    • Ngô Trung Hưng @hungnt

      changed this line in [version 3 of the diff](https://gitlab.zigexn.vn/hungnt/venjob_nth/merge_requests/2/diffs?diff_id=4864&start_sha=5683fa1122dbbd23778eab566fda50595be10b7d#2081481459f47e5e9b1f786abd02f6662de28d8d_35_36)

      Jul 28, 2020
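The question above can be checked in plain Ruby. Inside a class method `self` is the class object, so an `@variable` assigned there is an instance variable of the class itself: it is neither visible from objects created with `.new` nor a `@@class_variable`. `LinkCache` and its method names are hypothetical, used only to show where the variable lives.

```ruby
# Hypothetical class, only to show where `@links` lives.
class LinkCache
  def self.links
    # Inside a class method, `self` is the class, so @links belongs to LinkCache itself.
    @links ||= %w[job-1 job-2]
  end

  def links_seen_from_instance
    # Inside an instance method, `self` is the instance, which has its own (unset) @links.
    @links
  end
end

LinkCache.links                            # memoizes on the class object
p LinkCache.instance_variable_get(:@links) # set on the class
p LinkCache.new.links_seen_from_instance   # not set on an instance
p LinkCache.class_variables                # no @@ variables were created
```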
  • Hoang Phuc Do
    @phucdh started a discussion on an old version of the diff Jul 28, 2020
    Last updated by Ngô Trung Hưng Jul 29, 2020
    lib/src/interface_web.rb 0 → 100644
    73 desc = page.search('.cp_aboutus_item .content_fck').text
    74 else
    75 name = company_name.strip
    76 address = page.search('.company-info .info .content p:nth-child(3)').text
    77 desc = page.search('.main-about-us .content').text
    78 end
    79 begin
    80 if name.present? && address.present? && desc.present?
    81 Company.find_or_create_by(name: name.strip) do |company|
    82 company.name = name.strip
    83 company.address = address
    84 company.short_description = desc
    85 end
    86 puts name
    87 end
    88 rescue StandardError => e
    • Hoang Phuc Do @phucdh commented Jul 28, 2020
      Master

      This code pattern appears more than once:

      ```ruby
      begin
        # Your code
      rescue StandardError => e
        puts e
      end
      ```

      Can it be refactored so the pattern is shared?
    • Ngô Trung Hưng @hungnt commented Jul 28, 2020
      Master

      Drop the `begin` and `end`:

      ```ruby
        # Your code
      rescue StandardError => e
        puts e
      ```
    • Hoang Phuc Do @phucdh commented Jul 28, 2020
      Master

      The `# Your code` part is what changes; the begin and rescue stay the same. How could this be refactored?

      Edited Jul 28, 2020 by Hoang Phuc Do
    • Ngô Trung Hưng @hungnt

      changed this line in [version 6 of the diff](https://gitlab.zigexn.vn/hungnt/venjob_nth/merge_requests/2/diffs?diff_id=4882&start_sha=8fecd429de43ac36f58321166eb4084a2eda70b9#2081481459f47e5e9b1f786abd02f6662de28d8d_89_65)

      Jul 29, 2020
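One possible answer to the question above, offered as a sketch rather than as what version 6 of the diff actually did: keep the `rescue` in a single helper and pass the varying part in as a block. `with_error_report` is a hypothetical name.

```ruby
# Hypothetical helper: runs the block and reports any StandardError once, in one place.
def with_error_report
  yield
rescue StandardError => e
  puts e
  nil # make the "failed" return value explicit
end

# The varying "# Your code" part becomes the block.
with_error_report { Integer('42') }
with_error_report { Integer('not a number') }
```

Each call site then shrinks to one line, and a change to the error handling (for example, logging instead of `puts`) happens in a single method.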
  • Hoang Phuc Do
    @phucdh started a discussion on an old version of the diff Jul 28, 2020
    Last updated by Ngô Trung Hưng Jul 28, 2020
    lib/src/interface_web.rb 0 → 100644
    176 exprience = page.search('.DetailJobNew li:nth-child(5) span').text.strip
    177 add_data(name, company_name, city_name, created_date, expiration_date, salary, industry_name, description, level, exprience)
    178 end
    179
    180 def self.make_foreign_industries_table(data, id_job)
    181 content = data.split(',')
    182 content.each do |val|
    183 val.gsub!('&amp;', '&') if val.include?('&amp;')
    184 id_industry = Industry.find_by name: val.strip
    185 id_industry = id_industry.blank? ? Industry.create!(name: val.strip).id : id_industry.id
    186 IndustryJob.create!(industry_id: id_industry, job_id: id_job)
    187 end
    188 end
    189
    190 def self.make_foreign_cities_table(data, id_job)
    191 cities = data.split(',')
    • Hoang Phuc Do @phucdh commented Jul 28, 2020
      Master

      These methods are all public, which means any outside class can call them.

      If an outside class passes `nil` for the `data` or `id_job` parameters, these methods can easily fail.

      Can the inputs be validated when the method runs?
    • Ngô Trung Hưng @hungnt commented Jul 28, 2020
      Master

      Yes, I can do that.
    • Ngô Trung Hưng @hungnt

      changed this line in [version 3 of the diff](https://gitlab.zigexn.vn/hungnt/venjob_nth/merge_requests/2/diffs?diff_id=4864&start_sha=5683fa1122dbbd23778eab566fda50595be10b7d#2081481459f47e5e9b1f786abd02f6662de28d8d_191_203)

      Jul 28, 2020
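A minimal sketch of such a check, using a guard clause at the top of the method. This is a database-free illustration: `industry_names` is a hypothetical helper that covers only the splitting step of `make_foreign_industries_table`, and raising `ArgumentError` is one option among several (returning early is another).

```ruby
# Hypothetical, database-free version of the splitting step.
def industry_names(data)
  # Guard clause: reject nil (or any non-String) before calling String methods on it.
  raise ArgumentError, 'data must be a String' unless data.is_a?(String)

  data.split(',').map { |name| name.gsub('&amp;', '&').strip }.reject(&:empty?)
end

p industry_names('IT &amp; Software, Sales')

begin
  industry_names(nil)
rescue ArgumentError => e
  puts e.message
end
```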
  • Hoang Phuc Do
    @phucdh started a discussion on an old version of the diff Jul 28, 2020
    Last updated by Ngô Trung Hưng Jul 28, 2020
    lib/src/interface_web.rb 0 → 100644
    82 company.name = name.strip
    83 company.address = address
    84 company.short_description = desc
    85 end
    86 puts name
    87 end
    88 rescue StandardError => e
    89 puts e
    90 end
    91 end
    92 end
    93
    94 def self.add_data(name, company_name, city_name, created_date, expiration_date, salary, industry_name, description, level, exprience)
    95 begin
    96 id_company = Company.find_by name: company_name
    97 id_company = id_company.present? ? id_company.id : 1
    • Hoang Phuc Do @phucdh commented Jul 28, 2020
      Master

      `1` is a magic number.
    • Ngô Trung Hưng @hungnt commented Jul 28, 2020
      Master

      I'll fix it right away.
    • Ngô Trung Hưng @hungnt

      changed this line in [version 3 of the diff](https://gitlab.zigexn.vn/hungnt/venjob_nth/merge_requests/2/diffs?diff_id=4864&start_sha=5683fa1122dbbd23778eab566fda50595be10b7d#2081481459f47e5e9b1f786abd02f6662de28d8d_97_97)

      Jul 28, 2020
  • Hoang Phuc Do
    @phucdh started a discussion on an old version of the diff Jul 28, 2020
    Last updated by Ngô Trung Hưng Jul 28, 2020
    lib/src/interface_web.rb 0 → 100644
    194 id_cities = id_cities.blank? ? City.create!(name: city.strip, area: 1).id : id_cities.id
    195 CityJob.create!(job_id: id_job, city_id: id_cities)
    196 end
    197 end
    198
    199 def self.make_data
    200 puts 'Please wait for crawl jobs data! . . .'
    201 link_crawl = link_job_and_companies
    202 arr_link = []
    203 link_crawl[1].each do |val|
    204 break if @@stop_crawl == val
    205 arr_link << val
    206 end
    207 arr_link.reverse!.each_with_index do |path, i|
    208 page = Nokogiri::HTML(URI.open(URI.parse(URI.escape(path))))
    209 if !page.search('.item-blue .detail-box:nth-child(1) ul li:nth-child(1) p')[0].nil?
    • Hoang Phuc Do @phucdh commented Jul 28, 2020
      Master

      Could this be replaced with the following?

      ```ruby
      if page.search('.item-blue .detail-box:nth-child(1) ul li:nth-child(1) p')[0].present?
      ```
    • Ngô Trung Hưng @hungnt commented Jul 28, 2020
      Master

      Yes, it can.
    • Ngô Trung Hưng @hungnt

      changed this line in [version 3 of the diff](https://gitlab.zigexn.vn/hungnt/venjob_nth/merge_requests/2/diffs?diff_id=4864&start_sha=5683fa1122dbbd23778eab566fda50595be10b7d#2081481459f47e5e9b1f786abd02f6662de28d8d_209_228)

      Jul 28, 2020
  • Hoang Phuc Do
    @phucdh started a discussion on an old version of the diff Jul 28, 2020
    Last updated by Ngô Trung Hưng Jul 28, 2020
    lib/src/interface_web.rb 0 → 100644
    178 end
    179
    180 def self.make_foreign_industries_table(data, id_job)
    181 content = data.split(',')
    182 content.each do |val|
    183 val.gsub!('&amp;', '&') if val.include?('&amp;')
    184 id_industry = Industry.find_by name: val.strip
    185 id_industry = id_industry.blank? ? Industry.create!(name: val.strip).id : id_industry.id
    186 IndustryJob.create!(industry_id: id_industry, job_id: id_job)
    187 end
    188 end
    189
    190 def self.make_foreign_cities_table(data, id_job)
    191 cities = data.split(',')
    192 cities.each do |city|
    193 id_cities = City.find_by name: city.strip
    • Hoang Phuc Do @phucdh commented Jul 28, 2020
      Master

      The `#find_by` method in Rails returns an ActiveRecord object.

      Naming the variable `id_cities` misrepresents the value it holds:

      • The name suggests the variable holds multiple values
      • Is it a list of ids for several cities?
    • Ngô Trung Hưng @hungnt

      changed this line in [version 3 of the diff](https://gitlab.zigexn.vn/hungnt/venjob_nth/merge_requests/2/diffs?diff_id=4864&start_sha=5683fa1122dbbd23778eab566fda50595be10b7d#2081481459f47e5e9b1f786abd02f6662de28d8d_193_203)

      Jul 28, 2020
  • Hoang Phuc Do
    @phucdh started a discussion on an old version of the diff Jul 28, 2020
    Last updated by Ngô Trung Hưng Jul 28, 2020
    lib/src/interface_web.rb 0 → 100644
    198
    199 def self.make_data
    200 puts 'Please wait for crawl jobs data! . . .'
    201 link_crawl = link_job_and_companies
    202 arr_link = []
    203 link_crawl[1].each do |val|
    204 break if @@stop_crawl == val
    205 arr_link << val
    206 end
    207 arr_link.reverse!.each_with_index do |path, i|
    208 page = Nokogiri::HTML(URI.open(URI.parse(URI.escape(path))))
    209 if !page.search('.item-blue .detail-box:nth-child(1) ul li:nth-child(1) p')[0].nil?
    210 crawl_data_jobs_interface_1(page)
    211 elsif page.search('section .template-200').text.present?
    212 crawl_data_jobs_interface_2(page)
    213 elsif page.search('.DetailJobNew ul li').size == 10 && !page.search('.right-col ul li').text.include?('Độ tuổi')
    • Hoang Phuc Do @phucdh commented Jul 28, 2020
      Master

      `10` is a magic number.
    • Ngô Trung Hưng @hungnt

      changed this line in [version 3 of the diff](https://gitlab.zigexn.vn/hungnt/venjob_nth/merge_requests/2/diffs?diff_id=4864&start_sha=5683fa1122dbbd23778eab566fda50595be10b7d#2081481459f47e5e9b1f786abd02f6662de28d8d_213_232)

      Jul 28, 2020
  • Ngô Trung Hưng @hungnt

    added 1 commit

    • 814a4af9 - fix part 2

    [Compare with previous version](https://gitlab.zigexn.vn/hungnt/venjob_nth/merge_requests/2/diffs?diff_id=4864&start_sha=5683fa1122dbbd23778eab566fda50595be10b7d)

    Jul 28, 2020
  • Hoang Phuc Do
    @phucdh started a discussion on an old version of the diff Jul 28, 2020
    Last updated by Ngô Trung Hưng Jul 29, 2020
    lib/src/interface_web.rb 0 → 100644
    7 COMPANY_SECURITY = 1
    8 SIZE_LI_INTERFACE_5 = 10
    9 INTERNATIONAL = 0
    10 DOMESTIC = 1
    11 RANGE = 69
    12
    13 def crawl_link(page)
    14 puts "Crawling link on page...\nPLease wait...\n"
    15 data = []
    16 website_companies = []
    17 website_jobs = []
    18
    19 file = File.readlines('tmp/link.txt', 'r') if File.exist?('tmp/link.txt')
    20 @@stop_crawl = file.blank? ? '' : file.join
    21 page.times do |i|
    22 page = Nokogiri::HTML(URI.open("https://careerbuilder.vn/viec-lam/tat-ca-viec-lam-trang-#{i + 1}-vi.html"))
    • Hoang Phuc Do @phucdh commented Jul 28, 2020
      Master

      This variable name is the same as the name of the method's parameter.
    • Ngô Trung Hưng @hungnt

      changed this line in [version 6 of the diff](https://gitlab.zigexn.vn/hungnt/venjob_nth/merge_requests/2/diffs?diff_id=4882&start_sha=8fecd429de43ac36f58321166eb4084a2eda70b9#2081481459f47e5e9b1f786abd02f6662de28d8d_23_32)

      Jul 29, 2020
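In the diff, `page` is both the method parameter (a page count) and the variable assigned inside `page.times`, so after the first iteration the parameter no longer holds the count. A sketch with distinct names follows; `page_urls`, `page_count`, and `page_number` are hypothetical, the URL pattern is the one from the diff, and no request is made.

```ruby
# Hypothetical helper: builds the listing URLs without fetching them.
def page_urls(page_count)
  # The block variable has its own name, so `page_count` is never overwritten.
  (1..page_count).map do |page_number|
    "https://careerbuilder.vn/viec-lam/tat-ca-viec-lam-trang-#{page_number}-vi.html"
  end
end

puts page_urls(3)
```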
  • Hoang Phuc Do
    @phucdh started a discussion on an old version of the diff Jul 28, 2020
    Last updated by Ngô Trung Hưng Jul 29, 2020
    lib/src/interface_web.rb 0 → 100644
    5 # Crawler data
    6 class InterfaceWeb
    7 COMPANY_SECURITY = 1
    8 SIZE_LI_INTERFACE_5 = 10
    9 INTERNATIONAL = 0
    10 DOMESTIC = 1
    11 RANGE = 69
    12
    13 def crawl_link(page)
    14 puts "Crawling link on page...\nPLease wait...\n"
    15 data = []
    16 website_companies = []
    17 website_jobs = []
    18
    19 file = File.readlines('tmp/link.txt', 'r') if File.exist?('tmp/link.txt')
    20 @@stop_crawl = file.blank? ? '' : file.join
    • Hoang Phuc Do @phucdh commented Jul 28, 2020
      Master

      Có thể chuyển class variable này thành một method riêng được không ?

      Có thể chuyển class variable này thành một method riêng được không ?
    • Ngô Trung Hưng @hungnt

      Jul 29, 2020

      changed this line in [version 6 of the diff](https://gitlab.zigexn.vn/hungnt/venjob_nth/merge_requests/2/diffs?diff_id=4882&start_sha=8fecd429de43ac36f58321166eb4084a2eda70b9#2081481459f47e5e9b1f786abd02f6662de28d8d_21_32)
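One way to act on this suggestion is a memoized reader method in place of `@@stop_crawl`. This is a sketch only: the class name `StopMarker` and the parameter `link_path` are hypothetical. As an aside, the second positional argument of `File.readlines` is a line separator rather than an open mode, so `'r'` in the quoted line splits on the letter r; the later `join` hides this. `File.read` sidesteps the question:

```ruby
require 'tempfile'

# Hypothetical reader class; `StopMarker` and `link_path` are illustrative names.
class StopMarker
  def initialize(link_path)
    @link_path = link_path
  end

  # Returns the stored link, or '' when the file is missing.
  # Memoized in an instance variable instead of a class variable.
  def stop_link
    @stop_link ||= File.exist?(@link_path) ? File.read(@link_path).strip : ''
  end
end

marker_file = Tempfile.new('link')
marker_file.write("https://example.com/job/1\n")
marker_file.close

present = StopMarker.new(marker_file.path).stop_link
missing = StopMarker.new('/nonexistent/link.txt').stop_link
```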
  • Hoang Phuc Do
    @phucdh started a discussion on an old version of the diff Jul 28, 2020
    Last updated by Ngô Trung Hưng Jul 28, 2020
    lib/src/interface_web.rb 0 → 100644
    43 end
    44
    45 def craw_data_cities
    46 page = Nokogiri::HTML(URI.open('https://careerbuilder.vn/viec-lam/tat-ca-viec-lam-vi.html'))
    47 puts "Crawling data location... \n. \n. \n."
    48 data_list_cities = []
    49 data = page.search('#location option')
    50 list_cities = data.to_s.split('</option>')
    51 list_cities.each do |x|
    52 data_list_cities << x.gsub(/(^<[\w\D]*>)/, '').gsub(/\n/, '').rstrip
    53 end
    54 puts 'Save data to database...'
    55 data_list_cities.each_with_index do |val, index|
    56 area = index > RANGE ? INTERNATIONAL : DOMESTIC
    57 City.find_or_create_by(name: val) do |city|
    58 city.name = val
    • Hoang Phuc Do @phucdh commented Jul 28, 2020
      Master

      What is the purpose of reassigning the `name` attribute here?
    • Ngô Trung Hưng @hungnt commented Jul 28, 2020
      Master

      That was redundant on my part.
    • Ngô Trung Hưng @hungnt

      Jul 28, 2020

      changed this line in [version 4 of the diff](https://gitlab.zigexn.vn/hungnt/venjob_nth/merge_requests/2/diffs?diff_id=4868&start_sha=814a4af94bdf42665f1b8f9187e1ebc438ba614c#2081481459f47e5e9b1f786abd02f6662de28d8d_58_58)
  • Hoang Phuc Do
    @phucdh started a discussion on an old version of the diff Jul 28, 2020
    Last updated by Ngô Trung Hưng Jul 28, 2020
    lib/src/interface_web.rb 0 → 100644
    70 address = ''
    71 desc = ''
    72 company_name = page.search('.company-info .info .content .name').text
    73 if company_name.blank?
    74 name = page.search('.section-page #cp_company_name').text.strip
    75 address = page.search('.section-page .cp_basic_info_details ul li:nth-child(1)').text
    76 desc = page.search('.cp_aboutus_item .content_fck').text
    77 else
    78 name = company_name.strip
    79 address = page.search('.company-info .info .content p:nth-child(3)').text
    80 desc = page.search('.main-about-us .content').text
    81 end
    82 begin
    83 if name.present? && address.present? && desc.present?
    84 Company.find_or_create_by(name: name.strip) do |company|
    85 company.name = name.strip
    • Hoang Phuc Do @phucdh commented Jul 28, 2020
      Master

      What is the purpose of reassigning the `name` attribute here?
    • Ngô Trung Hưng @hungnt

      Jul 28, 2020

      changed this line in [version 4 of the diff](https://gitlab.zigexn.vn/hungnt/venjob_nth/merge_requests/2/diffs?diff_id=4868&start_sha=814a4af94bdf42665f1b8f9187e1ebc438ba614c#2081481459f47e5e9b1f786abd02f6662de28d8d_85_84)
  • Hoang Phuc Do
    @phucdh started a discussion on an old version of the diff Jul 28, 2020
    Last updated by Ngô Trung Hưng Jul 28, 2020
    lib/src/interface_web.rb 0 → 100644
    84 Company.find_or_create_by(name: name.strip) do |company|
    85 company.name = name.strip
    86 company.address = address
    87 company.short_description = desc
    88 end
    89 puts name
    90 end
    91 rescue StandardError => e
    92 puts e
    93 end
    94 end
    95 end
    96
    97 private
    98
    99 def add_data(name, company_name, city_name, created_date, expiration_date, salary, industry_name, description, level, exprience)
    • Hoang Phuc Do @phucdh commented Jul 28, 2020
      Master

      This method has too many parameters. The simplest fix is to pass them as a Hash.
    • Ngô Trung Hưng @hungnt commented Jul 28, 2020
      Master

      Yes, will do.
    • Ngô Trung Hưng @hungnt

      Jul 28, 2020

      changed this line in [version 4 of the diff](https://gitlab.zigexn.vn/hungnt/venjob_nth/merge_requests/2/diffs?diff_id=4868&start_sha=814a4af94bdf42665f1b8f9187e1ebc438ba614c#2081481459f47e5e9b1f786abd02f6662de28d8d_99_97)
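The Hash-based signature suggested in this thread might look roughly like the sketch below. `REQUIRED_KEYS` and the returned summary string are assumptions made for illustration; the method in the merge request writes to the database instead.

```ruby
# Hypothetical sketch: the ten positional arguments become one Hash.
REQUIRED_KEYS = %i[name company_name city_name].freeze

def add_data(data)
  missing = REQUIRED_KEYS.reject { |key| data.key?(key) }
  raise ArgumentError, "missing keys: #{missing.join(', ')}" unless missing.empty?

  # A real implementation would persist the record; this sketch returns a summary.
  "#{data[:name]} at #{data[:company_name]} (#{data[:city_name]})"
end

job = {
  name: 'Ruby Developer',
  company_name: 'Example Co',
  city_name: 'Ho Chi Minh',
  salary: 'Negotiable'
}
summary = add_data(job)
```

Callers no longer depend on argument order, and adding a field does not change the method signature.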
  • Hoang Phuc Do
    @phucdh started a discussion on an old version of the diff Jul 28, 2020
    Last updated by Ngô Trung Hưng Jul 28, 2020
    lib/src/interface_web.rb 0 → 100644
    126 city_name = location.join(',')
    127 created_date = page.search('.item-blue .detail-box:nth-child(1) ul li:nth-child(1) p')[0].text
    128 expiration_date = page.search('.item-blue .detail-box ul li:last')[1].text.delete!("[\n,\t,\r]").split(' ').last
    129 salary = page.search('.item-blue .detail-box:nth-child(1) ul li:nth-child(1) p')[1].text
    130 industries = page.search('.item-blue .detail-box:nth-child(1) ul li:nth-child(2) a').text
    131 industries = industries.delete!("[\n,\t,\r]").split(' ').select(&:present?)
    132 industry_name = industries.join(',')
    133 description = page.search('.tabs .tab-content .detail-row:nth-child(n)').to_s
    134 get_level = page.search('.item-blue .detail-box:last ul li:nth-child(3)').text.delete!("[\n,\t,\r]").lstrip.split('Cấp bậc')
    135 get_level = get_level[1].to_s.strip
    136 if get_level.blank?
    137 g_level = page.search('.item-blue .detail-box:last ul li:nth-child(2)').text.delete!("[\n,\t,\r]").lstrip.split('Cấp bậc')
    138 level = g_level[1].to_s.strip
    139 else
    140 g_level = get_level
    141 level = g_level
    • Hoang Phuc Do @phucdh commented Jul 28, 2020
      Master

      Simplify the variable assignment:

      level = get_level
    • Ngô Trung Hưng @hungnt

      Jul 28, 2020

      changed this line in [version 4 of the diff](https://gitlab.zigexn.vn/hungnt/venjob_nth/merge_requests/2/diffs?diff_id=4868&start_sha=814a4af94bdf42665f1b8f9187e1ebc438ba614c#2081481459f47e5e9b1f786abd02f6662de28d8d_141_137)
  • Hoang Phuc Do
    @phucdh started a discussion on an old version of the diff Jul 28, 2020
    Last updated by Ngô Trung Hưng Jul 28, 2020
    lib/src/interface_web.rb 0 → 100644
    174
    175 def crawl_data_jobs_interface_5(page)
    176 name = page.search('.info-company h1').text
    177 company_name = page.search('.info-company .text-job h2').text
    178 city_name = page.search('.DetailJobNew ul li:nth-child(1) a').text
    179 created_date = ''
    180 expiration_date = page.search('.DetailJobNew li:nth-child(9) span').text.strip
    181 salary = page.search('.DetailJobNew li:nth-child(3) span').text.strip
    182 industry_name = page.search('.DetailJobNew li:nth-child(2) span').text.strip
    183 description = page.search('.left-col .detail-row')
    184 level = page.search('.DetailJobNew ul li:nth-child(6) span').text.strip
    185 exprience = page.search('.DetailJobNew li:nth-child(5) span').text.strip
    186 add_data(name, company_name, city_name, created_date, expiration_date, salary, industry_name, description, level, exprience)
    187 end
    188
    189 private
    • Hoang Phuc Do @phucdh commented Jul 28, 2020
      Master

      Put all public methods first and private methods below them.

      Ordering priority:

      public > private > protected

      Public methods do not need an explicit `public` declaration.

      Declare private methods after `private`.

      class A
        def a
          # Code
        end

        private
        def b
          # Code
        end

        def c
          # Code
        end
      end
      Edited Jul 28, 2020 by Hoang Phuc Do
    • Ngô Trung Hưng @hungnt

      Jul 28, 2020

      changed this line in [version 4 of the diff](https://gitlab.zigexn.vn/hungnt/venjob_nth/merge_requests/2/diffs?diff_id=4868&start_sha=814a4af94bdf42665f1b8f9187e1ebc438ba614c#2081481459f47e5e9b1f786abd02f6662de28d8d_189_183)
  • Ngô Trung Hưng @hungnt

    added 1 commit

    • 716b0bd9 - fix -part 3

    [Compare with previous version](https://gitlab.zigexn.vn/hungnt/venjob_nth/merge_requests/2/diffs?diff_id=4868&start_sha=814a4af94bdf42665f1b8f9187e1ebc438ba614c)

    Jul 28, 2020
  • Hoang Phuc Do
    @phucdh started a discussion on an old version of the diff Jul 28, 2020
    Last updated by Ngô Trung Hưng Jul 28, 2020
    lib/src/interface_web.rb 0 → 100644
    1 # frozen_string_literal: true
    2
    3 require 'open-uri'
    4
    5 # Crawler data
    6 class InterfaceWeb
    • Hoang Phuc Do @phucdh commented Jul 28, 2020
      Master

      The class could be named `Crawler`.
    • Ngô Trung Hưng @hungnt

      Jul 28, 2020

      changed this line in [version 5 of the diff](https://gitlab.zigexn.vn/hungnt/venjob_nth/merge_requests/2/diffs?diff_id=4871&start_sha=716b0bd9f0f7ae3b936f5e03900a916cf028064b#2081481459f47e5e9b1f786abd02f6662de28d8d_6_6)
    • Ngô Trung Hưng @hungnt commented Jul 28, 2020
      Master

      Yes, will do.
  • Hoang Phuc Do
    @phucdh started a discussion on an old version of the diff Jul 28, 2020
    Last updated by Ngô Trung Hưng Jul 29, 2020
    lib/src/interface_web.rb 0 → 100644
    93 end
    94
    95 private
    96
    97 def add_data(data)
    98 id_company = Company.find_by name: data[:company_name]
    99 id_company = id_company.present? ? id_company.id : COMPANY_SECURITY
    100 id_job = Job.create!(name: data[:name],
    101 company_id: id_company,
    102 level: data[:level],
    103 experience: data[:exprience],
    104 salary: data[:salary],
    105 create_date: data[:created_date],
    106 expiration_date: data[:expiration_date],
    107 description: data[:description])
    108 make_foreign_industries_table(data[:industry_name], id_job.id)
    • Hoang Phuc Do @phucdh commented Jul 28, 2020
      Master

      Judging by its name, `make_foreign_industries_table`, this function's job is to create the `foreign_industries` table. Is that right?

      What does this function actually do?
    • Ngô Trung Hưng @hungnt

      Jul 29, 2020

      changed this line in [version 6 of the diff](https://gitlab.zigexn.vn/hungnt/venjob_nth/merge_requests/2/diffs?diff_id=4882&start_sha=8fecd429de43ac36f58321166eb4084a2eda70b9#2081481459f47e5e9b1f786abd02f6662de28d8d_130_100)
  • Hoang Phuc Do
    @phucdh started a discussion on an old version of the diff Jul 28, 2020
    Last updated by Ngô Trung Hưng Jul 29, 2020
    lib/src/interface_web.rb 0 → 100644
    94
    95 private
    96
    97 def add_data(data)
    98 id_company = Company.find_by name: data[:company_name]
    99 id_company = id_company.present? ? id_company.id : COMPANY_SECURITY
    100 id_job = Job.create!(name: data[:name],
    101 company_id: id_company,
    102 level: data[:level],
    103 experience: data[:exprience],
    104 salary: data[:salary],
    105 create_date: data[:created_date],
    106 expiration_date: data[:expiration_date],
    107 description: data[:description])
    108 make_foreign_industries_table(data[:industry_name], id_job.id)
    109 make_foreign_cities_table(data[:city_name], id_job.id)
    • Hoang Phuc Do @phucdh commented Jul 28, 2020
      Master

      Judging by its name, `make_foreign_cities_table`, this function's job is to create the `foreign_cities` table. Is that right?

      What does this function actually do?
    • Ngô Trung Hưng @hungnt commented Jul 28, 2020
      Master

      It takes the data from the methods `crawl_data_jobs_interface_1`, `crawl_data_jobs_interface_2`, and `crawl_data_jobs_interface_3` and saves it to the DB.
    • Ngô Trung Hưng @hungnt

      Jul 29, 2020

      changed this line in [version 6 of the diff](https://gitlab.zigexn.vn/hungnt/venjob_nth/merge_requests/2/diffs?diff_id=4882&start_sha=8fecd429de43ac36f58321166eb4084a2eda70b9#2081481459f47e5e9b1f786abd02f6662de28d8d_131_100)
  • Hoang Phuc Do
    @phucdh started a discussion on an old version of the diff Jul 28, 2020
    Last updated by Ngô Trung Hưng Jul 29, 2020
    lib/src/interface_web.rb 0 → 100644
    99 id_company = id_company.present? ? id_company.id : COMPANY_SECURITY
    100 id_job = Job.create!(name: data[:name],
    101 company_id: id_company,
    102 level: data[:level],
    103 experience: data[:exprience],
    104 salary: data[:salary],
    105 create_date: data[:created_date],
    106 expiration_date: data[:expiration_date],
    107 description: data[:description])
    108 make_foreign_industries_table(data[:industry_name], id_job.id)
    109 make_foreign_cities_table(data[:city_name], id_job.id)
    110 rescue StandardError => e
    111 puts e
    112 end
    113
    114 def crawl_data_jobs_interface_1(page)
    • Hoang Phuc Do @phucdh commented Jul 28, 2020
      Master

      The methods `crawl_data_jobs_interface_1`, `crawl_data_jobs_interface_2`, and `crawl_data_jobs_interface_5` all share the following flow:

      1. Receive the input data from `page`
      2. Process it to extract name, company_name, city_name, created_date, ...
      3. Save the processed data to the DB

      Of these steps, step 2 varies, while steps 1 and 3 are handled the same way.

      Apply the Template Method pattern: split each of these methods into its own class and keep the shared flow in a base class.
    • Ngô Trung Hưng @hungnt

      Jul 29, 2020

      changed this line in [version 6 of the diff](https://gitlab.zigexn.vn/hungnt/venjob_nth/merge_requests/2/diffs?diff_id=4882&start_sha=8fecd429de43ac36f58321166eb4084a2eda70b9#2081481459f47e5e9b1f786abd02f6662de28d8d_136_102)
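The three-step flow described in this thread can be sketched with the Template Method pattern as follows. All class names here (`JobPageCrawler`, `InterfaceOneCrawler`, `InterfaceTwoCrawler`) are hypothetical, the `page` objects are plain Hashes standing in for Nokogiri documents, and `save` only returns the data instead of writing to a database.

```ruby
# Base class owns the shared flow; subclasses supply only the extraction step.
class JobPageCrawler
  def initialize(page)
    @page = page
  end

  # Template method: receiving the page and saving are fixed,
  # extraction is delegated to the subclass.
  def crawl
    data = extract(@page)
    save(data)
  end

  private

  def extract(_page)
    raise NotImplementedError, "#{self.class} must implement extract"
  end

  # Stand-in for the DB write; returns what would be saved.
  def save(data)
    data.merge(saved: true)
  end
end

class InterfaceOneCrawler < JobPageCrawler
  private

  def extract(page)
    { name: page[:title], company_name: page[:company] }
  end
end

class InterfaceTwoCrawler < JobPageCrawler
  private

  def extract(page)
    { name: page[:heading], company_name: page[:employer] }
  end
end

page_one = { title: 'Ruby Developer', company: 'Example Co' }
page_two = { heading: 'QA Engineer', employer: 'Sample Ltd' }
first = InterfaceOneCrawler.new(page_one).crawl
second = InterfaceTwoCrawler.new(page_two).crawl
```

Supporting another page layout then means adding one subclass with its own `extract`, without touching the shared flow.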
  • Ngô Trung Hưng @hungnt

    added 1 commit

    • 8fecd429 - fix -part 4

    [Compare with previous version](https://gitlab.zigexn.vn/hungnt/venjob_nth/merge_requests/2/diffs?diff_id=4871&start_sha=716b0bd9f0f7ae3b936f5e03900a916cf028064b)

    Jul 28, 2020
  • Hoang Phuc Do
    @phucdh started a discussion on an old version of the diff Jul 28, 2020
    Last updated by Ngô Trung Hưng Jul 29, 2020
    lib/src/interface_web.rb 0 → 100644
    1 # frozen_string_literal: true
    2
    3 require 'open-uri'
    4
    5 # Crawler data
    6 class Crawler
    7 COMPANY_SECURITY = 1
    8 NUMBER_LINK = 1
    9 SIZE_LI_INTERFACE_5 = 10
    10 INTERNATIONAL = 0
    • Hoang Phuc Do @phucdh commented Jul 28, 2020
      Master

      `INTERNATIONAL`, `DOMESTIC`, and `RANGE` belong to the City model's logic.

      => Move these values into `app/models/city.rb`

      See how enums are used in Rails.
    • Ngô Trung Hưng @hungnt

      Jul 29, 2020

      changed this line in [version 6 of the diff](https://gitlab.zigexn.vn/hungnt/venjob_nth/merge_requests/2/diffs?diff_id=4882&start_sha=8fecd429de43ac36f58321166eb4084a2eda70b9#2081481459f47e5e9b1f786abd02f6662de28d8d_10_10)
  • Hoang Phuc Do
    @phucdh started a discussion on an old version of the diff Jul 28, 2020
    Last updated by Ngô Trung Hưng Jul 31, 2020
    lib/src/interface_web.rb 0 → 100644
    1 # frozen_string_literal: true
    2
    3 require 'open-uri'
    4
    5 # Crawler data
    6 class Crawler
    7 COMPANY_SECURITY = 1
    • Hoang Phuc Do @phucdh commented Jul 28, 2020
      Master

      `COMPANY_SECURITY` belongs to the Company model's logic.

      => Move the value into `app/models/company.rb`
    • Ngô Trung Hưng @hungnt

      Jul 31, 2020

      changed this line in [version 7 of the diff](https://gitlab.zigexn.vn/hungnt/venjob_nth/merge_requests/2/diffs?diff_id=4892&start_sha=0ac0989ba71355e4e03286cc25e91468c1baa94e#2081481459f47e5e9b1f786abd02f6662de28d8d_7_0)
  • Hoang Phuc Do
    @phucdh started a discussion on an old version of the diff Jul 28, 2020
    Last updated by Ngô Trung Hưng Jul 29, 2020
    lib/src/interface_web.rb 0 → 100644
    138 data[:name] = page.search('.apply-now-content .job-desc .title').text
    139 data[:company_name] = page.search('.apply-now-content .job-desc .job-company-name').text
    140 location = []
    141 length = page.search('.detail-box .map p a').size
    142 length.times do |n|
    143 location << page.search(".detail-box .map p a:nth-child(#{n + 1})").text
    144 end
    145 data[:city_name] = location.join(',')
    146 data[:created_date] = page.search('.item-blue .detail-box:nth-child(1) ul li:nth-child(1) p')[0].text
    147 data[:expiration_date] = page.search('.item-blue .detail-box ul li:last')[1].text.delete!("[\n,\t,\r]").split(' ').last
    148 data[:salary] = page.search('.item-blue .detail-box:nth-child(1) ul li:nth-child(1) p')[1].text
    149 industries = page.search('.item-blue .detail-box:nth-child(1) ul li:nth-child(2) a').text
    150 industries = industries.delete!("[\n,\t,\r]").split(' ').select(&:present?)
    151 data[:industry_name] = industries.join(',')
    152 data[:description] = page.search('.tabs .tab-content .detail-row:nth-child(n)').to_s
    153 get_level = page.search('.item-blue .detail-box:last ul li:nth-child(3)').text.delete!("[\n,\t,\r]").lstrip.split('Cấp bậc')
    • Hoang Phuc Do @phucdh commented Jul 28, 2020
      Master

      Why use `delete!` instead of `delete`?
    • Ngô Trung Hưng @hungnt

      Jul 29, 2020

      changed this line in [version 6 of the diff](https://gitlab.zigexn.vn/hungnt/venjob_nth/merge_requests/2/diffs?diff_id=4882&start_sha=8fecd429de43ac36f58321166eb4084a2eda70b9#2081481459f47e5e9b1f786abd02f6662de28d8d_153_102)
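The difference behind this question can be shown in a few lines. `String#delete!` returns `nil` when it removes nothing, so a chained call such as `.split` fails on text that is already clean, while `String#delete` always returns a String. The sample strings below are made up for illustration. The sketch also shows that the argument is a set of characters, not a regular expression, so the quoted `"[\n,\t,\r]"` removes brackets and commas as well.

```ruby
# String.new gives a mutable copy regardless of frozen-string settings.
clean = String.new('2020-08-31')

# Nothing to remove: delete! signals "no change" with nil.
bang_result = clean.delete!("\n\t\r")

# The non-bang form always returns a String, so chaining stays safe.
plain_result = clean.delete("\n\t\r")

# The selector lists characters literally; commas and brackets are removed too.
with_commas = 'IT,Software'.delete("[\n,\t,\r]")
```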
  • Ngô Trung Hưng @hungnt

    added 1 commit

    • 0ac0989b - fix code

    [Compare with previous version](https://gitlab.zigexn.vn/hungnt/venjob_nth/merge_requests/2/diffs?diff_id=4882&start_sha=8fecd429de43ac36f58321166eb4084a2eda70b9)

    Jul 29, 2020
  • Hoang Phuc Do
    @phucdh started a discussion on an old version of the diff Jul 29, 2020
    Last updated by Ngô Trung Hưng Jul 31, 2020
    lib/src/interface_web.rb 0 → 100644
    10
    11 def path_to_first_link
    12 Rails.root.join('tmp', 'link.txt')
    13 end
    14
    15 def logger
    16 @logger ||= Logger.new(Rails.root.join('log', 'crawler.log'))
    17 end
    18
    19 def stop_crawler
    20 file = File.readlines(path_to_first_link, 'r') if File.exist?(path_to_first_link)
    21 file.blank? ? '' : file.join
    22 end
    23
    24 def safe_link(url)
    25 Nokogiri::HTML(URI.open(URI.parse(URI.escape(url))))
    • Hoang Phuc Do @phucdh commented Jul 29, 2020
      Master

      `URI.escape(url)` => `CGI.escape(url)`
    • Ngô Trung Hưng @hungnt

      Jul 31, 2020

      changed this line in [version 7 of the diff](https://gitlab.zigexn.vn/hungnt/venjob_nth/merge_requests/2/diffs?diff_id=4892&start_sha=0ac0989ba71355e4e03286cc25e91468c1baa94e#2081481459f47e5e9b1f786abd02f6662de28d8d_25_0)
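A note on applying this suggestion: `URI.escape` was removed in Ruby 3.0, and `CGI.escape` is not a drop-in replacement for a whole URL, because it also encodes `:` and `/` and writes spaces as `+`. The sketch below escapes only the variable segment. `BASE_URL` and `build_job_url` are hypothetical names, not part of the merge request.

```ruby
require 'cgi'

BASE_URL = 'https://example.com/viec-lam'

# Escapes only the variable segment, leaving the scheme and slashes intact.
def build_job_url(slug)
  "#{BASE_URL}/#{CGI.escape(slug)}"
end

segment_url = build_job_url('ke toan & thue')

# Escaping the full URL also encodes the scheme separator and slashes.
whole_url = CGI.escape('https://example.com/a b')
```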
  • Hoang Phuc Do
    @phucdh started a discussion on an old version of the diff Jul 29, 2020
    Last updated by Ngô Trung Hưng Jul 31, 2020
    lib/src/interface_web.rb 0 → 100644
    1 # frozen_string_literal: true
    2
    3 require 'open-uri'
    4
    5 # Crawler data
    6 class Crawler
    • Hoang Phuc Do @phucdh commented Jul 29, 2020
      Master

      The file name and the class name do not match.
    • Ngô Trung Hưng @hungnt commented Jul 29, 2020
      Master

      Yes, I'll fix it right away.
    • Ngô Trung Hưng @hungnt

      Jul 31, 2020

      changed this line in [version 7 of the diff](https://gitlab.zigexn.vn/hungnt/venjob_nth/merge_requests/2/diffs?diff_id=4892&start_sha=0ac0989ba71355e4e03286cc25e91468c1baa94e#2081481459f47e5e9b1f786abd02f6662de28d8d_6_0)
  • Ngô Trung Hưng @hungnt

    added 1 commit

    • a0abd223 - use Template Method Pattern

    [Compare with previous version](https://gitlab.zigexn.vn/hungnt/venjob_nth/merge_requests/2/diffs?diff_id=4892&start_sha=0ac0989ba71355e4e03286cc25e91468c1baa94e)

    Jul 31, 2020
  • Ngô Trung Hưng @hungnt

    added 1 commit

    • 043ca43e - fix crawler

    [Compare with previous version](https://gitlab.zigexn.vn/hungnt/venjob_nth/merge_requests/2/diffs?diff_id=4897&start_sha=a0abd223bff085b5edb847d74a54c000c14978ea)

    Jul 31, 2020
  • Ngô Trung Hưng @hungnt

    added 1 commit

    • d2e14dc8 - autoload_paths

    [Compare with previous version](https://gitlab.zigexn.vn/hungnt/venjob_nth/merge_requests/2/diffs?diff_id=4902&start_sha=043ca43e326dcde33334a68e71a24fbd5747a34e)

    Aug 03, 2020
  • Ngô Trung Hưng @hungnt

    added 1 commit

    • 33c084b5 - fix rubocop

    [Compare with previous version](https://gitlab.zigexn.vn/hungnt/venjob_nth/merge_requests/2/diffs?diff_id=4903&start_sha=d2e14dc8b37752e24a26835274b16081f294fb0f)

    Aug 03, 2020
  • Thanh Hung Pham
    @hungpt started a discussion on an old version of the diff Aug 03, 2020
    Last updated by Ngô Trung Hưng Aug 03, 2020
    lib/src/base/base.rb 0 → 100644
    1 # frozen_string_literal: true
    2
    3 require 'nokogiri'
    4 require 'open-uri'
    5 require 'logger'
    6
    7 # Crawler data
    8 class Base
    9 COMPANY_SECURITY = 1
    • Thanh Hung Pham @hungpt commented Aug 03, 2020

      @hungnt I see this constant is declared in two places.

      • It would be better to move it into the Company model.
      • What does this name mean?
    • Ngô Trung Hưng @hungnt commented Aug 03, 2020
      Master

      I declared it twice by mistake and have just moved it into Company. A few companies have no public company link: they list the company name only as BẢO MẬT (confidential), and the company's address and details are shown directly in the job description. So when a job's company name is Bảo Mật, it is saved with `company_id = 1`, which corresponds to the record in the DB.
    • Ngô Trung Hưng @hungnt

      Aug 03, 2020

      changed this line in [version 11 of the diff](https://gitlab.zigexn.vn/hungnt/venjob_nth/merge_requests/2/diffs?diff_id=4907&start_sha=33c084b58acc7ba121fbddce1df4987d6e3c8748#42748f6402cd5c441c28a62c28e5cc7ad34a9e4f_9_9)
  • Hoang Phuc Do
    @phucdh started a discussion on an old version of the diff Aug 03, 2020
    Last updated by Ngô Trung Hưng Aug 03, 2020
    lib/src/base/base.rb 0 → 100644
    62 end
    63
    64 def fill_salary
    65 page.xpath('//ul//li[position()=1]//p')[1].text
    66 end
    67
    68 def fill_industry_name
    69 industries = page.xpath('//ul//li[position()=2]//p//a').map(&:text)
    70 industries.map(&:strip).join(',')
    71 end
    72
    73 def fill_description
    74 job[:description] = page.search('.tabs .tab-content .detail-row').to_s
    75 end
    76
    77 def check
    • Hoang Phuc Do @phucdh commented Aug 03, 2020
      Master

      Add a `?` at the end of the method name when it returns a Boolean (true/false).

      https://medium.com/@sologoubalex/boolean-methods-in-ruby-94a2e907e5ea

      def includes_experience?
      Edited Aug 03, 2020 by Hoang Phuc Do
    • Ngô Trung Hưng @hungnt commented Aug 03, 2020
      Master

      Yes, will do.
    • Ngô Trung Hưng @hungnt

      Aug 03, 2020

      changed this line in [version 11 of the diff](https://gitlab.zigexn.vn/hungnt/venjob_nth/merge_requests/2/diffs?diff_id=4907&start_sha=33c084b58acc7ba121fbddce1df4987d6e3c8748#42748f6402cd5c441c28a62c28e5cc7ad34a9e4f_77_75)
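As an illustration of the naming rule in this thread, the `check` method could become a predicate whose name states what it tests. `JobDetail` and `list_text` are hypothetical: `list_text` stands in for `page.search('//ul//li').text` so that the sketch runs without Nokogiri.

```ruby
# Hypothetical stand-in for the crawler class in the merge request.
class JobDetail
  def initialize(list_text)
    @list_text = list_text
  end

  # Predicate method: the trailing ? signals a true/false result,
  # and the name says what is being checked.
  def includes_experience?
    @list_text.include?('Kinh nghiệm')
  end
end

with_exp = JobDetail.new('Cấp bậc: Nhân viên Kinh nghiệm: 2 năm').includes_experience?
without_exp = JobDetail.new('Cấp bậc: Nhân viên').includes_experience?
```

A call site such as `if includes_experience?` then reads as a plain condition.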
  • Thanh Hung Pham
    @hungpt started a discussion on an old version of the diff Aug 03, 2020
    Last updated by Ngô Trung Hưng Aug 03, 2020
    lib/src/base/base.rb 0 → 100644
    62 end
    63
    64 def fill_salary
    65 page.xpath('//ul//li[position()=1]//p')[1].text
    66 end
    67
    68 def fill_industry_name
    69 industries = page.xpath('//ul//li[position()=2]//p//a').map(&:text)
    70 industries.map(&:strip).join(',')
    71 end
    72
    73 def fill_description
    74 job[:description] = page.search('.tabs .tab-content .detail-row').to_s
    75 end
    76
    77 def check
    • Thanh Hung Pham @hungpt commented Aug 03, 2020

      @hungnt Two things:

      • Methods that return `true` or `false` should be named with a trailing `?`, for example `check?`

      Reference rule: https://github.com/rubocop-hq/ruby-style-guide#bool-methods-qmark

      • What does the method name mean? What does `check` check?
    • Ngô Trung Hưng @hungnt commented Aug 03, 2020
      Master

      It checks whether `page.search('//ul//li').text` contains 'kinh nghiệm' (experience). I'll fix the method.
    • Ngô Trung Hưng @hungnt

      Aug 03, 2020

      changed this line in [version 11 of the diff](https://gitlab.zigexn.vn/hungnt/venjob_nth/merge_requests/2/diffs?diff_id=4907&start_sha=33c084b58acc7ba121fbddce1df4987d6e3c8748#42748f6402cd5c441c28a62c28e5cc7ad34a9e4f_77_75)
  • Thanh Hung Pham
    @hungpt started a discussion on an old version of the diff Aug 03, 2020
    Last updated by Ngô Trung Hưng Aug 03, 2020
    lib/src/base/base.rb 0 → 100644
    68 def fill_industry_name
    69 industries = page.xpath('//ul//li[position()=2]//p//a').map(&:text)
    70 industries.map(&:strip).join(',')
    71 end
    72
    73 def fill_description
    74 job[:description] = page.search('.tabs .tab-content .detail-row').to_s
    75 end
    76
    77 def check
    78 noname = page.search('//ul//li').text
    79 noname.include?('Kinh nghiệm')
    80 end
    81
    82 def fill_lever
    83 if check
    • Thanh Hung Pham @hungpt commented Aug 03, 2020

      @hungnt Please give the method a name that is easy to understand. What does `check` check here?
    • Ngô Trung Hưng @hungnt commented Aug 03, 2020
      Master

      Yes, I'll rename the method.
    • Ngô Trung Hưng @hungnt

      changed this line in [version 11 of the diff](https://gitlab.zigexn.vn/hungnt/venjob_nth/merge_requests/2/diffs?diff_id=4907&start_sha=33c084b58acc7ba121fbddce1df4987d6e3c8748#42748f6402cd5c441c28a62c28e5cc7ad34a9e4f_83_81)

      Aug 03, 2020
  • Thanh Hung Pham
    @hungpt started a discussion on an old version of the diff Aug 03, 2020
    Last updated by Ngô Trung Hưng Aug 04, 2020
    lib/src/crawler.rb 0 → 100644
    2
    3 require 'open-uri'
    4
    5 # Crawler data
    6 class Crawler
    7 COMPANY_SECURITY = 1
    8 RANGE = 69
    9
    10 attr_accessor :number_link
    11
    12 def initialize(number_link)
    13 @number_link = number_link
    14 end
    15
    16 def path_to_first_link
    17 Rails.root.join('tmp', 'link.txt')
    • Thanh Hung Pham @hungpt commented Aug 03, 2020

      @hungnt What does the file `link.txt` contain?
    • Ngô Trung Hưng @hungnt commented Aug 03, 2020
      Master

      The `link.txt` file stores the link of the first job from the last crawl, so that when the crawler next runs and reaches that same link, it stops crawling and avoids duplicating data.
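The stop-at-the-saved-link idea described in this reply can be sketched roughly as follows. `LinkCheckpoint` and its methods are hypothetical names, not taken from the merge request, and the sketch assumes the crawler receives links ordered newest first.

```ruby
require 'tmpdir'

# Hypothetical helper: remembers the first job link of the previous run.
class LinkCheckpoint
  def initialize(path)
    @path = path
  end

  # Link saved by the previous run, or nil on the very first run.
  def last_link
    File.exist?(@path) ? File.read(@path).strip : nil
  end

  # Overwrites the checkpoint with the given link.
  def save(link)
    File.write(@path, link)
  end

  # Keeps only the links that come before the saved one
  # (assumes `links` is ordered newest first).
  def new_links(links)
    stop = last_link
    links.take_while { |link| link != stop }
  end
end
```

On the first run no file exists, so every link is treated as new. After each run, the first link of the fresh batch would be saved as the next stopping point.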
    • Ngô Trung Hưng @hungnt

      changed this line in [version 12 of the diff](https://gitlab.zigexn.vn/hungnt/venjob_nth/merge_requests/2/diffs?diff_id=4928&start_sha=bf921c4cc1782c4d950af2221187a20b98d99dd2#8b9d28e5e4927c2446025b2b64159cc601866739_14_14)

      Aug 04, 2020
  • Hoang Phuc Do
    @phucdh started a discussion on an old version of the diff Aug 03, 2020
    Last updated by Ngô Trung Hưng Aug 04, 2020
    lib/src/base/base.rb 0 → 100644
    67
    68 def fill_industry_name
    69 industries = page.xpath('//ul//li[position()=2]//p//a').map(&:text)
    70 industries.map(&:strip).join(',')
    71 end
    72
    73 def fill_description
    74 job[:description] = page.search('.tabs .tab-content .detail-row').to_s
    75 end
    76
    77 def check
    78 noname = page.search('//ul//li').text
    79 noname.include?('Kinh nghiệm')
    80 end
    81
    82 def fill_lever
    • Hoang Phuc Do @phucdh commented Aug 03, 2020
      Master

      Use the ternary operator: https://github.com/rubocop-hq/ruby-style-guide#ternary-operator
    • Ngô Trung Hưng @hungnt commented Aug 03, 2020
      Master

      Understood.
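For illustration, a short `if`/`else` that only picks between two values collapses into a ternary as below. The method name `level_position` and the values 3 and 2 are made up, because the diff does not show what each branch of `fill_lever` does.

```ruby
# Illustrative only: the positions below are invented, since the diff
# does not show the branches of fill_lever.
def level_position(experience_listed)
  # Ternary form of: if experience_listed then 3 else 2 end
  experience_listed ? 3 : 2
end
```

The style guide linked above favours this form for simple one-line choices and advises against nesting ternaries.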
    • Ngô Trung Hưng @hungnt

      changed this line in [version 12 of the diff](https://gitlab.zigexn.vn/hungnt/venjob_nth/merge_requests/2/diffs?diff_id=4928&start_sha=bf921c4cc1782c4d950af2221187a20b98d99dd2#42748f6402cd5c441c28a62c28e5cc7ad34a9e4f_80_8)

      Aug 04, 2020
  • Thanh Hung Pham
    @hungpt started a discussion on an old version of the diff Aug 03, 2020
    Last updated by Ngô Trung Hưng Aug 04, 2020
    lib/src/crawler_job.rb 0 → 100644
    1 # frozen_string_literal: true
    2
    3 # Crawler data job
    4 class CrawlerJob < Crawler
    5 SIZE_LI = 8
    • Thanh Hung Pham @hungpt commented Aug 03, 2020

      @hungnt What is the meaning of this?
    • Ngô Trung Hưng @hungnt commented Aug 03, 2020
      Master

      It is used to detect the page layout: the number of `<li>` tags in that block.
    • Ngô Trung Hưng @hungnt

      changed this line in [version 12 of the diff](https://gitlab.zigexn.vn/hungnt/venjob_nth/merge_requests/2/diffs?diff_id=4928&start_sha=bf921c4cc1782c4d950af2221187a20b98d99dd2#5dc53d3b8f4caf5a37e54de024cd265df2e112f5_5_3)

      Aug 04, 2020
  • Thanh Hung Pham
    @hungpt started a discussion on an old version of the diff Aug 03, 2020
    Last updated by Ngô Trung Hưng Aug 03, 2020
    lib/src/crawler_job.rb 0 → 100644
    33 parse_data.each do |path|
    34 page = safe_link(path)
    35 if page.search('.item-blue .detail-box:nth-child(1) ul li:nth-child(1) p')[0].present?
    36 @data = RedInterface.new(page).create_data
    37 elsif page.search('section .template-200').text.present?
    38 @data = BlueInterface.new(page).create_data
    39 elsif page.search('.DetailJobNew ul li').size == SIZE_LI && page.search('.right-col ul li').text.exclude?('Độ tuổi')
    40 @data = GreenInterface.new(page).create_data
    41 end
    42 add_data(@data)
    43 refresh_first_link
    44 end
    45 end
    46
    47 def add_data(data)
    48 id_company = (Company.find_by name: data[:company_name]).try(:id) || COMPANY_SECURITY
    • Thanh Hung Pham @hungpt commented Aug 03, 2020

      @hungnt Here, if the `company` is not found, you default the job to `COMPANY_SECURITY`. Is that the right logic?
    • Ngô Trung Hưng @hungnt commented Aug 03, 2020
      Master

      In most cases the job's company will be found. Only `Bảo mật` (confidential) listings have no company name, and their link is `javascript:void(0);`, so nearly all not-found cases come from those listings. That is why I did it this way.
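One possible way to address the reviewer's concern, sketched under assumptions: apply the placeholder only to confidential listings and return `nil` for a company that is genuinely missing. The module `CompanyLookup`, the method `company_id_for`, and the Hash standing in for the `Company` table are all hypothetical; the id 1 mirrors `COMPANY_SECURITY` from the diff.

```ruby
# Hypothetical stand-ins: a Hash replaces the Company table, and the
# id 1 mirrors COMPANY_SECURITY from the diff.
module CompanyLookup
  COMPANY_SECURITY = 1
  CONFIDENTIAL_NAME = 'Bảo mật'

  module_function

  def company_id_for(name, companies)
    cleaned = name.to_s.strip
    # Confidential listings carry no usable company name.
    return COMPANY_SECURITY if cleaned.empty? || cleaned == CONFIDENTIAL_NAME

    # nil signals a company that is genuinely missing, so the caller can
    # log it instead of filing the job under the placeholder.
    companies[cleaned]
  end
end
```

With this split, a job whose company failed to import is not silently attributed to the confidential placeholder.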
    • Ngô Trung Hưng @hungnt

      changed this line in [version 11 of the diff](https://gitlab.zigexn.vn/hungnt/venjob_nth/merge_requests/2/diffs?diff_id=4907&start_sha=33c084b58acc7ba121fbddce1df4987d6e3c8748#5dc53d3b8f4caf5a37e54de024cd265df2e112f5_48_48)

      Aug 03, 2020
  • Thanh Hung Pham
    @hungpt started a discussion on an old version of the diff Aug 03, 2020
    Last updated by Ngô Trung Hưng Aug 03, 2020
    lib/src/crawler_job.rb 0 → 100644
    49 job = Job.create(name: data[:name],
    50 company_id: id_company,
    51 level: data[:level],
    52 experience: data[:exprience],
    53 salary: data[:salary],
    54 create_date: data[:created_date],
    55 expiration_date: data[:expiration_date],
    56 description: data[:description])
    57 create_industry_relation(data[:industry_name], job)
    58 create_city_relation(data[:city_name], job)
    59 rescue StandardError => e
    60 logger.error "Crawler data jobs has error: #{e}"
    61 end
    62
    63 def create_industry_relation(data, job)
    64 return if data.blank? && id_job.blank?
    • Thanh Hung Pham @hungpt commented Aug 03, 2020

      @hungnt Where does the variable `id_job` come from?
    • Ngô Trung Hưng @hungnt commented Aug 03, 2020
      Master

      I got this part wrong; I'll fix it.
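A hedged sketch of what a corrected guard clause could look like. It is not the author's actual fix, which this thread does not show. The method `industry_names_for` is a hypothetical reduction of `create_industry_relation` that only returns the names to link, and switching `&&` to `||` is an assumption about the intent (skip when either input is missing).

```ruby
# Hypothetical reduction of create_industry_relation: returns the industry
# names that would be linked to the job, or [] when there is nothing to link.
def industry_names_for(data, job)
  # Use the `job` parameter (id_job is never defined in this scope) and
  # `||` so that either missing input skips the work.
  return [] if data.nil? || data.strip.empty? || job.nil?

  # The diff joins industry names with ',', so split on the same separator.
  data.split(',').map(&:strip)
end
```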
    • Ngô Trung Hưng @hungnt

      changed this line in [version 11 of the diff](https://gitlab.zigexn.vn/hungnt/venjob_nth/merge_requests/2/diffs?diff_id=4907&start_sha=33c084b58acc7ba121fbddce1df4987d6e3c8748#5dc53d3b8f4caf5a37e54de024cd265df2e112f5_64_64)

      Aug 03, 2020
  • Ngô Trung Hưng @hungnt

    added 1 commit

    • bf921c4c - ..

    [Compare with previous version](https://gitlab.zigexn.vn/hungnt/venjob_nth/merge_requests/2/diffs?diff_id=4907&start_sha=33c084b58acc7ba121fbddce1df4987d6e3c8748)

    Aug 03, 2020
  • Ngô Trung Hưng @hungnt

    added 1 commit

    • 37488f20 - autoload

    [Compare with previous version](https://gitlab.zigexn.vn/hungnt/venjob_nth/merge_requests/2/diffs?diff_id=4928&start_sha=bf921c4cc1782c4d950af2221187a20b98d99dd2)

    Aug 04, 2020
  • Ngô Trung Hưng @hungnt

    added 1 commit

    • 3c7e899c - fix autoload

    [Compare with previous version](https://gitlab.zigexn.vn/hungnt/venjob_nth/merge_requests/2/diffs?diff_id=4929&start_sha=37488f20aecccb60cefc4e474f39f5c7829ab6c8)

    Aug 04, 2020
  • Ngô Trung Hưng @hungnt

    added 9 commits

    • 3c7e899c...2e9845a6 - 8 commits from branch `master`
    • 0b319ea9 - Merge branch 'master' into 'crawler'

    [Compare with previous version](https://gitlab.zigexn.vn/hungnt/venjob_nth/merge_requests/2/diffs?diff_id=4931&start_sha=3c7e899c31dd7b00da7bdde5905cdc75ce3e507f)

    Aug 04, 2020
  • Hoang Phuc Do @phucdh

    mentioned in commit 5131650d1109db9b6327fc1b4529a3aaf7c3fa1a

    Aug 04, 2020
  • Hoang Phuc Do @phucdh

    merged

    Aug 04, 2020
3 participants
Reference: hungnt/venjob_nth!2