最新消息:雨落星辰是一个专注网站SEO优化、网站SEO诊断、搜索引擎研究、网络营销推广、网站策划运营及站长类的自媒体原创博客

powerbi - Get value from next row when column does not contain consectutive values in Power BI - Stack Overflow

programmeradmin6浏览0评论

I have a table with the following data

record_id site start_date activity_1 activity_2 activity_3 activity_4
10 1 9/24/2022 basketball baseball
10 1 10/1/2022 basketball
10 1 10/3/2022 basketball
10 1 10/11/2022 baseball football
10 1 11/1/2022 football soccer
10 1 12/12/2022

I have a table with the following data

record_id site start_date activity_1 activity_2 activity_3 activity_4
10 1 9/24/2022 basketball baseball
10 1 10/1/2022 basketball
10 1 10/3/2022 basketball
10 1 10/11/2022 baseball football
10 1 11/1/2022 football soccer
10 1 12/12/2022

I want to do two things: combine the activity columns into one column while retaining the same record_id, site and start_date row values and create a new end_date column. The end_date will come from the next start_date after that activity has a blank field. My expected result is below.

record_id site start_date activity end_date
10 1 9/24/2022 basketball 10/11/2022
10 1 10/1/2022 basketball 10/11/2022
10 1 10/3/2022 basketball 10/11/2022
10 1 9/24/2022 baseball 10/1/2022
10 1 10/11/2022 baseball 11/1/2022
10 1 10/11/2022 football 12/12/2022
10 1 11/1/2022 football 12/12/2022
10 1 11/1/2022 soccer 12/12/2022

I've tried this query, but it only gives me the next start_date after the activity, even if the next one is not blank.

let
// Step 1: Load the data
Source = Sheet2,
// Step 2: Unpivot activity columns into one column
UnpivotedColumns = Table.UnpivotOtherColumns(Source, {"record_id", "Site", "start_date"}, "ActivityType", "activity"),

// Step 3: Remove rows where activity is null or empty
FilteredRows = Table.SelectRows(UnpivotedColumns, each ([activity] <> null and [activity] <> "")),

// Step 4: Sort data by record_id, Site, and start_date
SortedTable = Table.Sort(FilteredRows, {{"record_id", Order.Ascending}, {"Site", Order.Ascending}, {"start_date", Order.Ascending}}),

// Step 5: Create a table with DISTINCT start dates
DistinctStartDates = Table.Distinct(SortedTable[[record_id], [Site], [start_date]]),

// Step 6: Add a column to find the next start_date for each group
AddedNextStartDate = Table.AddColumn(DistinctStartDates, "next_start_date", each 
    let 
        CurrentRecordID = [record_id],
        CurrentSite = [Site],
        CurrentStartDate = [start_date],
        // Find the next available start_date
        NextDates = Table.SelectRows(DistinctStartDates, each ([record_id] = CurrentRecordID and [Site] = CurrentSite and [start_date] > CurrentStartDate)),
        NextStartDate = if Table.RowCount(NextDates) > 0 then NextDates[start_date]{0} else null
    in 
        NextStartDate, type nullable date
),

// Step 7: Merge this next_start_date back into the main table
MergedTable = Table.NestedJoin(SortedTable, {"record_id", "Site", "start_date"}, AddedNextStartDate, {"record_id", "Site", "start_date"}, "NextDateTable", JoinKind.LeftOuter),
ExpandedTable = Table.ExpandTableColumn(MergedTable, "NextDateTable", {"next_start_date"}),

// Step 8: Fill in null end_dates with the maximum date (if applicable)
//MaxEndDate = Date.From("12/12/2022"), // Replace with a suitable max date
MaxEndDate = List.Max(Source[start_date]), // Finds the latest date in the Start_Date column
FinalTable = Table.TransformColumns(ExpandedTable, {"next_start_date", each if _ = null then MaxEndDate else _, type date}),

// Step 9: Rename next_start_date to end_date
RenamedTable = Table.RenameColumns(FinalTable, {{"next_start_date", "end_date"}}),
#"Sorted Rows" = Table.Sort(RenamedTable,{{"ActivityType", Order.Ascending}})
in
#"Sorted Rows"

Any help would be appreciated

Share Improve this question edited Mar 31 at 20:37 mcadamsjustin asked Mar 31 at 16:44 mcadamsjustinmcadamsjustin 3754 gold badges12 silver badges23 bronze badges 2
  • Am I correct in assuming that start_date 1/1/2022 should really be 11/1/2022? And why does the first football have an end_date of 10/11/2022. Shouldn't it be 12/12/2022? – Ron Rosenfeld Commented Mar 31 at 19:19
  • This has been fixed – mcadamsjustin Commented Mar 31 at 20:37
Add a comment  | 

1 Answer 1

Reset to default 1

This seems to work, although I have to think there should be a more efficient method of accomplishing this task.

Edit: Code edited to a more efficient version

  • Code should adapt to any number of Activities

Custom Function
(rename as per the comments)
processes one activity at a time

//Rename "fnEndDate"

(tbl as table, act as text)=>

[cn=List.FirstN(Table.ColumnNames(tbl),3),
 a=Table.SelectColumns(tbl,cn & {act}),
 b=Table.FromColumns(
     Table.ToColumns(a) &
     {{null} & List.RemoveLastN(Table.Column(a,act),1)},
     cn & {"Activity","Shifted"}),
 c=Table.ReplaceValue(
     b,
     each [Shifted],
     each [start_date],
     (x,y,z)=>if x=null and y<>null then z else x,
     {"Activity"}),
 d=Table.RemoveColumns(c,{"Shifted"}),
 e=Table.FillDown(d,{"Activity"}),
 f=Table.AddColumn(e,"end_date", each if not (try Date.From([Activity]))[HasError] 
                                        then [Activity] 
                                        else null, type nullable date),
 g=Table.FillUp(f,{"end_date"}),
 h=Table.SelectRows(g, each (try Date.From([Activity]))[HasError]),
 i=Table.TransformColumnTypes(h,{"Activity", type text})
][i]

Main Code

let

//Change next line to reflect actual data source
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Source,{
        {"record_id", type text}, {"site", Int64.Type}, {"start_date", type date} &
            List.Transform(List.Skip(Table.ColumnNames(Source),3), each {_, type any})}),
    #"End Dates" = List.Accumulate(
        List.Skip(Table.ColumnNames(#"Changed Type"),3),
        #table({},{}),
        (s,c)=> Table.Combine({s,fnEndDate(#"Changed Type",c)})
    )
in
    #"End Dates"

Original Data

Results

发布评论

评论列表(0)

  1. 暂无评论